qiskit icon indicating copy to clipboard operation
qiskit copied to clipboard

Fully port Optimize1qGatesDecomposition to Rust

Open mtreinish opened this issue 1 year ago • 6 comments

Summary

This commit builds off of #12550 and the other data model in Rust infrastructure and migrates the Optimize1qGatesDecomposition pass to operate fully in Rust. The full path of the transpiler pass now never leaves rust until it has finished modifying the DAGCircuit. There is still some python interaction necessary to handle parts of the data model that are still in Python, mainly calibrations and parameter expressions (for global phase). But otherwise the entirety of the pass operates in rust now.

This is just a first pass at the migration here, it moves the pass to be a single for loop in rust. The next steps here are to look at operating the pass in parallel. There is no data dependency between the optimizations being done by the pass so we should be able to the throughput of the pass by leveraging multithreading to handle each run in parallel. This commit does not attempt this though, because of the Python dependency and also the data structures around gates and the dag aren't really setup for multithreading yet and there likely will need to be some work to support that (this pass is a good candidate to work through the bugs on that).

Details and comments

Part of #12208

This PR is based on top of #12550 and as such github shows the entire contents of #12550 in addition to the contents of this PR. To see the contents of this PR you can look at HEAD on this branch, or just look at: https://github.com/Qiskit/qiskit/pull/12959/commits/0c8b668f1d16783ed864b6dcbf3366692fce82b3

TODO:

  • [x] Rebase after #12550 merges
  • [x] Benchmark and tune the implementation
  • [x] Investigate if there is any dead code in the euler_one_qubit_decomposer module

mtreinish avatar Aug 14 '24 16:08 mtreinish

One or more of the following people are relevant to this code:

  • @enavarro51
  • @Qiskit/terra-core
  • @kevinhartman
  • @levbishop
  • @mtreinish
  • @nkanazawa1989

qiskit-bot avatar Aug 14 '24 16:08 qiskit-bot

I ran some asv benchmarks on this and it's yielding very good looking performance improvements:

Benchmarks that have improved:

| Change   | Before [76eb568c] <use-dag-1q-decompose~9^2>   | After [7eaa9544] <use-dag-1q-decompose>   |   Ratio | Benchmark (Parameter)                                                                                         |
|----------|------------------------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| -        | 5.46±0.02ms                                    | 1.23±0ms                                  |    0.23 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])  |
| -        | 13.9±0.06ms                                    | 2.96±0.04ms                               |    0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 25.4±0.2ms                                     | 4.02±0.02ms                               |    0.16 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 4.11±0.01ms                                    | 595±20μs                                  |    0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id'])                     |
| -        | 16.4±0.03ms                                    | 2.05±0.01ms                               |    0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id'])                    |
| -        | 5.85±0.01ms                                    | 782±10μs                                  |    0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id'])         |
| -        | 15.6±0.06ms                                    | 1.95±0.02ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |
| -        | 11.9±0.04ms                                    | 1.47±0.02ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id'])                    |
| -        | 30.4±0.3ms                                     | 2.65±0.04ms                               |    0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

mtreinish avatar Aug 14 '24 16:08 mtreinish

Pull Request Test Coverage Report for Build 10634728662

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 306 of 351 (87.18%) changed or added relevant lines in 6 files are covered.
  • 230 unchanged lines in 8 files lost coverage.
  • Overall coverage decreased (-0.06%) to 89.132%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/circuit/src/dag_circuit.rs 92 114 80.7%
crates/accelerate/src/euler_one_qubit_decomposer.rs 189 212 89.15%
<!-- Total: 306 351
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 94.02%
crates/accelerate/src/two_qubit_decompose.rs 1 90.84%
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py 2 94.17%
crates/qasm2/src/lex.rs 4 92.23%
crates/accelerate/src/euler_one_qubit_decomposer.rs 9 91.1%
crates/qasm2/src/parse.rs 12 97.15%
crates/circuit/src/circuit_data.rs 49 90.54%
crates/circuit/src/operations.rs 152 88.26%
<!-- Total: 230
Totals Coverage Status
Change from base Build 10598013395: -0.06%
Covered Lines: 71802
Relevant Lines: 80557

💛 - Coveralls

coveralls avatar Aug 14 '24 17:08 coveralls

Now that #12550 has merged I've rebased this and it's ready for review.

mtreinish avatar Aug 23 '24 17:08 mtreinish

Now that this is ready for merging I ran two benchmark runs for this pass against main:

Benchmarks that have improved:

| Change   | Before [d3040a0b] <main>            | After [2c0aad55] <use-dag-1q-decompose>   |   Ratio | Benchmark (Parameter)                                                                                         |
|----------|-------------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| -        | 5.50±0.03ms                         | 1.24±0ms                                  |    0.23 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])  |
| -        | 14.0±0.08ms                         | 2.95±0.02ms                               |    0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 24.3±0.2ms                          | 4.11±0.02ms                               |    0.17 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 4.12±0.03ms                         | 572±1μs                                   |    0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id'])                     |
| -        | 15.9±0.06ms                         | 2.01±0.01ms                               |    0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id'])                    |
| -        | 5.99±0.02ms                         | 768±4μs                                   |    0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id'])         |
| -        | 15.7±0.2ms                          | 1.92±0.03ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |
| -        | 11.8±0.05ms                         | 1.44±0.01ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id'])                    |
| -        | 28.1±0.2ms                          | 2.63±0.01ms                               |    0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

and against 1.2.0 (to discount any potential performance change from https://github.com/Qiskit/qiskit/pull/12550):

Benchmarks that have improved:

| Change   | Before [7e2e835e] <1.2.0^0>   | After [2c0aad55] <use-dag-1q-decompose>   |   Ratio | Benchmark (Parameter)                                                                                         |
|----------|-------------------------------|-------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| -        | 5.53±0.02ms                   | 1.23±0.01ms                               |    0.22 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id'])  |
| -        | 14.0±0.1ms                    | 2.98±0.03ms                               |    0.21 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 25.7±0.1ms                    | 4.08±0.03ms                               |    0.16 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| -        | 4.14±0.02ms                   | 584±10μs                                  |    0.14 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id'])                     |
| -        | 5.93±0.02ms                   | 778±10μs                                  |    0.13 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id'])         |
| -        | 15.8±0.09ms                   | 1.95±0.03ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |
| -        | 12.1±0.09ms                   | 1.49±0.02ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id'])                    |
| -        | 16.5±0.1ms                    | 2.06±0.04ms                               |    0.12 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id'])                    |
| -        | 30.5±0.2ms                    | 2.61±0.05ms                               |    0.09 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id'])        |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

mtreinish avatar Aug 23 '24 21:08 mtreinish

I think that euler_one_qubit_decomposer.rs should be moved to crates/accelerate/src/synthesis/one_qubit to match with the structure of the current synthesis library. Similarly, it would be good if two_qubit_decompose.rs would be moved to crates/accelerate/src/synthesis/two_qubits

We can reorganize these modules in a follow up PR. When I created these modules there wasn't any other synthesis in rust so there wasn't anywhere else to put them. I think it'll probably be best to do this at the same time we create a transpiler crate so the Optimize1qGatesDecomposition function that I modify here is separate from the base synthesis code.

mtreinish avatar Aug 26 '24 10:08 mtreinish