qiskit
qiskit copied to clipboard
Rework preset pass mangers levels
What is the expected enhancement?
Right now we have 4 optimization levels 0-3 that right now are split like:
Optimization Level 0: No optimization passes run, just trivial layout -> stochastic swap -> basis translator
Optimization Level 1: Trivial Layout/Dense layout -> stochastic swap -> basis translator -> optimization loop (optimize 1q gates and cx cancellation)
Optimization Level 2: CSPLayout/Trivial Layout/Dense layout -> stochastic swap -> basis translator -> optimization loop (optimize 1q gates and commutative cancellation)
Optimization Level 3: CSPLayout/Trivial Layout/Dense layout -> stochastic swap -> basis translator -> optimization loop (2q block collection and synthesis, optimize 1q gates and commutative cancellation)
Also passes that are shared between levels and have configurable numbers of trials use more at higher optimization levels.
However with recent performance improvements we've made in Qiskit the past couple of years we have some time budget to improve the quality of the circuits we output at level 1 and add more passes to the higher optimization levels.
This is related to #7016 where some simple hand optimizations were not being performed at the default level 1. I think we need to characterize the runtime performance of the different passes and compare the tradeoff in utility they provide and once we have a better sense for this in data we can make informed decision.
The rough idea I had was basically to shift everything >1 down one level so that level 1's optimization loop has commutative cancellation, level 2 has the 2q collection and synthesis, and level 3 we add more optimization passes to like template optimization (would require performance work prior to adding). But this would all need to be measured so we can make informed decisions instead of picking blindly.
For 0.19, introducing sabre
as the default {layout,routing}_method
(following #7036) maybe at higher optimization_level
s seems a place to start.
The default should be O3 with Sabre layout and routing. Otherwise users get sub-optimal performance across most circuits, especially when compared to other compilers. This shows up again and again in benchmarks from others.