qiskit
qiskit copied to clipboard
Fully port Optimize1qGatesDecomposition to Rust
Summary
This commit builds off of #12550 and the other data model in Rust infrastructure and migrates the Optimize1qGatesDecomposition pass to operate fully in Rust. The full path of the transpiler pass now never leaves rust until it has finished modifying the DAGCircuit. There is still some python interaction necessary to handle parts of the data model that are still in Python, mainly calibrations and parameter expressions (for global phase). But otherwise the entirety of the pass operates in rust now.
This is just a first pass at the migration here, it moves the pass to be a single for loop in rust. The next steps here are to look at operating the pass in parallel. There is no data dependency between the optimizations being done by the pass so we should be able to the throughput of the pass by leveraging multithreading to handle each run in parallel. This commit does not attempt this though, because of the Python dependency and also the data structures around gates and the dag aren't really setup for multithreading yet and there likely will need to be some work to support that (this pass is a good candidate to work through the bugs on that).
Details and comments
Part of #12208
This PR is based on top of #12550 and as such github shows the entire contents of #12550 in addition to the contents of this PR. To see the contents of this PR you can look at HEAD on this branch, or just look at: https://github.com/Qiskit/qiskit/pull/12959/commits/0c8b668f1d16783ed864b6dcbf3366692fce82b3
TODO:
- [x] Rebase after #12550 merges
- [x] Benchmark and tune the implementation
- [x] Investigate if there is any dead code in the
euler_one_qubit_decomposermodule
One or more of the following people are relevant to this code:
- @enavarro51
@Qiskit/terra-core@kevinhartman@levbishop@mtreinish@nkanazawa1989
I ran some asv benchmarks on this and it's yielding very good looking performance improvements:
Benchmarks that have improved:
| Change | Before [76eb568c] <use-dag-1q-decompose~9^2> | After [7eaa9544] <use-dag-1q-decompose> | Ratio | Benchmark (Parameter) |
|----------|------------------------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| - | 5.46±0.02ms | 1.23±0ms | 0.23 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 13.9±0.06ms | 2.96±0.04ms | 0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 25.4±0.2ms | 4.02±0.02ms | 0.16 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 4.11±0.01ms | 595±20μs | 0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id']) |
| - | 16.4±0.03ms | 2.05±0.01ms | 0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id']) |
| - | 5.85±0.01ms | 782±10μs | 0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 15.6±0.06ms | 1.95±0.02ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 11.9±0.04ms | 1.47±0.02ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id']) |
| - | 30.4±0.3ms | 2.65±0.04ms | 0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
Pull Request Test Coverage Report for Build 10634728662
Warning: This coverage report may be inaccurate.
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
- For more information on this, see Tracking coverage changes with pull request builds.
- To avoid this issue with future PRs, see these Recommended CI Configurations.
- For a quick fix, rebase this PR at GitHub. Your next report should be accurate.
Details
- 306 of 351 (87.18%) changed or added relevant lines in 6 files are covered.
- 230 unchanged lines in 8 files lost coverage.
- Overall coverage decreased (-0.06%) to 89.132%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| crates/circuit/src/dag_circuit.rs | 92 | 114 | 80.7% |
| crates/accelerate/src/euler_one_qubit_decomposer.rs | 189 | 212 | 89.15% |
| <!-- | Total: | 306 | 351 |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| crates/qasm2/src/expr.rs | 1 | 94.02% |
| crates/accelerate/src/two_qubit_decompose.rs | 1 | 90.84% |
| qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 2 | 94.17% |
| crates/qasm2/src/lex.rs | 4 | 92.23% |
| crates/accelerate/src/euler_one_qubit_decomposer.rs | 9 | 91.1% |
| crates/qasm2/src/parse.rs | 12 | 97.15% |
| crates/circuit/src/circuit_data.rs | 49 | 90.54% |
| crates/circuit/src/operations.rs | 152 | 88.26% |
| <!-- | Total: | 230 |
| Totals | |
|---|---|
| Change from base Build 10598013395: | -0.06% |
| Covered Lines: | 71802 |
| Relevant Lines: | 80557 |
💛 - Coveralls
Now that #12550 has merged I've rebased this and it's ready for review.
Now that this is ready for merging I ran two benchmark runs for this pass against main:
Benchmarks that have improved:
| Change | Before [d3040a0b] <main> | After [2c0aad55] <use-dag-1q-decompose> | Ratio | Benchmark (Parameter) |
|----------|-------------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| - | 5.50±0.03ms | 1.24±0ms | 0.23 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 14.0±0.08ms | 2.95±0.02ms | 0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 24.3±0.2ms | 4.11±0.02ms | 0.17 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 4.12±0.03ms | 572±1μs | 0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id']) |
| - | 15.9±0.06ms | 2.01±0.01ms | 0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id']) |
| - | 5.99±0.02ms | 768±4μs | 0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 15.7±0.2ms | 1.92±0.03ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 11.8±0.05ms | 1.44±0.01ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id']) |
| - | 28.1±0.2ms | 2.63±0.01ms | 0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
and against 1.2.0 (to discount any potential performance change from https://github.com/Qiskit/qiskit/pull/12550):
Benchmarks that have improved:
| Change | Before [7e2e835e] <1.2.0^0> | After [2c0aad55] <use-dag-1q-decompose> | Ratio | Benchmark (Parameter) |
|----------|-------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| - | 5.53±0.02ms | 1.23±0.01ms | 0.22 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 14.0±0.1ms | 2.98±0.03ms | 0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 25.7±0.1ms | 4.08±0.03ms | 0.16 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 4.14±0.02ms | 584±10μs | 0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id']) |
| - | 5.93±0.02ms | 778±10μs | 0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 15.8±0.09ms | 1.95±0.03ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 12.1±0.09ms | 1.49±0.02ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id']) |
| - | 16.5±0.1ms | 2.06±0.04ms | 0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id']) |
| - | 30.5±0.2ms | 2.61±0.05ms | 0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
I think that euler_one_qubit_decomposer.rs should be moved to crates/accelerate/src/synthesis/one_qubit to match with the structure of the current synthesis library. Similarly, it would be good if two_qubit_decompose.rs would be moved to crates/accelerate/src/synthesis/two_qubits
We can reorganize these modules in a follow up PR. When I created these modules there wasn't any other synthesis in rust so there wasn't anywhere else to put them. I think it'll probably be best to do this at the same time we create a transpiler crate so the Optimize1qGatesDecomposition function that I modify here is separate from the base synthesis code.