qiskit icon indicating copy to clipboard operation
qiskit copied to clipboard

Context-aware dynamical decoupling

Open Cryoris opened this issue 1 year ago • 3 comments

Summary

Implement context-aware dynamical decoupling (DD) from arXiv:2403.06852.

Details and comments

When selecting decoupling sequences for an idle qubit, context-aware DD takes into account what actions are occuring on neighboring qubits. It ensures that neighboring sequences are mutually orthogonal (to avoid collisions, like in staggered DD) but also whether the qubit is a CX or ECR target/control spectator. This is nicely summarized in a figure of the paper, where the different wire colors denote different decoupling sequences:

image

This particularly leads to improvement if a lot of qubits are idle at the same time or if they are next to ECR/CX gates. One important category of circuits in this class are the layers in twirled circuits. This pass also works best if the ECR/CX gate times are the same for all connections, which is almost always the case on IBM devices.

Some simple experiments show that this pass performs better than the current standard in Qiskit, PadDynamicaDecoupling with an X-X sequence. Here's some basic benchmarks, see the paper for more:

image

This code is based off @kdk and @ajavadia 🙂

Cryoris avatar Jul 26 '24 14:07 Cryoris

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

qiskit-bot avatar Jul 26 '24 14:07 qiskit-bot

Pull Request Test Coverage Report for Build 15410851778

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 356 of 381 (93.44%) changed or added relevant lines in 4 files are covered.
  • 1208 unchanged lines in 26 files lost coverage.
  • Overall coverage increased (+0.04%) to 87.874%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/passes/scheduling/padding/context_aware_dynamical_decoupling.py 353 378 93.39%
<!-- Total: 356 381
Files with Coverage Reduction New Missed Lines %
qiskit/circuit/library/pauli_evolution.py 2 96.49%
qiskit/result/sampled_expval.py 2 93.55%
crates/qasm2/src/lex.rs 3 92.48%
qiskit/transpiler/passes/routing/sabre_swap.py 3 96.97%
crates/cext/src/exit_codes.rs 4 0.0%
qiskit/synthesis/evolution/suzuki_trotter.py 4 90.38%
qiskit/synthesis/multi_controlled/mcx_synthesis.py 4 98.55%
crates/transpiler/src/passes/split_2q_unitaries.rs 5 96.62%
qiskit/synthesis/evolution/qdrift.py 5 86.49%
crates/accelerate/src/twirling.rs 6 97.33%
<!-- Total: 1208
Totals Coverage Status
Change from base Build 15300566108: 0.04%
Covered Lines: 80816
Relevant Lines: 91968

💛 - Coveralls

coveralls avatar Jul 30 '24 14:07 coveralls

Anything holding this up from being merged? Happy to lend a hand if needed to speed up the process @Cryoris @eliarbel

miamico avatar Oct 18 '24 19:10 miamico

https://github.com/Qiskit/qiskit/pull/12825#pullrequestreview-2404327735 is a very good catch -- the newly added gates that implement the DD sequences didn't have a duration attached, since they weren't present when the scheduling pass was run. I'll add a test to check for this 🙂

Cryoris avatar Dec 03 '24 17:12 Cryoris

After discussion with @ElePT I also updated the default behavior of min_joinable_duration: On realistic devices, the 2q gate lengths vary a little -- yet might not want to insert DD sequences for two concurrent 2q gates, just because they naturally differ slightly. The new default is therefore that we only add DD sequences for idle times beyond 2 * (max_2q_duration - min_2q_duration). This needs some benchmarking on real devices before merging.

Cryoris avatar Dec 03 '24 17:12 Cryoris

#12825 (review) is a very good catch -- the newly added gates that implement the DD sequences didn't have a duration attached, since they weren't present when the scheduling pass was run. I'll add a test to check for this 🙂

Just so you know I'm working on removing per instruction durations in #13506 so we probably don't need to change this now. It would have been important for 1.3, but in case 2.0 we just making extra work for both of us.

mtreinish avatar Dec 04 '24 00:12 mtreinish

I removed any setting and usage of Instruction.duration in 21c4ea0, so hopefully you don't have to update the context-aware DD in #13506 🙂 That means the timeline drawer will fail now, but I assume this will be fixed in your PR 🙂

Cryoris avatar Dec 04 '24 13:12 Cryoris

After discussion with @ElePT I also updated the default behavior of min_joinable_duration: On realistic devices, the 2q gate lengths vary a little -- yet might not want to insert DD sequences for two concurrent 2q gates, just because they naturally differ slightly. The new default is therefore that we only add DD sequences for idle times beyond 2 * (max_2q_duration - min_2q_duration). This needs some benchmarking on real devices before merging.

Hi @Cryoris are you planning to benchmark the new min_joinable_duration default on real devices before the 2.1 release?

eliarbel avatar May 05 '25 11:05 eliarbel

Yep, it's good from my side now 🙂 The number of X gates is rather large though, per design of the algorithm, I think that was just surprising to see.

E.g. for an N-local circuit with linear entanglement on n qubits (not that someone would run this), the number of X gates goes with O(n^2) instead of O(n) for "normal" DD. But the results of CA-DD look promising on the eagle devices and the overhead can be controlled with the min_joinable_duration parameter.

Cryoris avatar May 07 '25 13:05 Cryoris

Overall this looks in good shape to me both from code and testing perspectives. I haven't reviewed the algorithm in depth against the paper, but I ran several examples and the results look OK. Still, I'd like to look into some of the algorithmic aspects a bit more. In the meantime, I have a couple questions and possible suggestions, both not major:

The rationale behind the default value for min_joinable_duration is not so clear to me. Also, it doesn't feel very intuitive as a knob for controlling the amount of DD inserted into the circuit. Is there some underlying intuition behind the choice, something that could be added to the docstring to help clarify this a bit?

'cz' is included in the calculation of the 2q gate length variance in _gate_length_variance. However, the logic for handling spectator qubits does not account for CZGates. Was this intentional, perhaps due to the differences in crosstalk levels between Heron and Eagle devices? If so, this introduces a technology assumption in the implementation, isn't it? I'm wondering if it would make sense to provide a way for users to opt in or out of spectator qubit analysis for specific gate types?

eliarbel avatar May 13 '25 11:05 eliarbel

Re: min_joinable_duration -- this shouldn't set the amount of DD inserted, but rather control how adjacent idle blocks are joined. It's a knob that lets you trade-off between potentially many short sequences or longer blocks. The default value is picked to ensure that a layer of CX/ECR will not have a lot of short DD sequences just due to the difference in duration. But this is not discussed in the paper afaik and I'm happy to change this if you think there's a more reasonable choice 🙂

Re: CZ -- The paper only investigates CX/ECR. CZ would probably require a different spectator qubit handling, but this is not explained in the paper and hence not supported in the code. We can add an argument that allows to turn off the spectator-specific logic for CX/ECR.

Cryoris avatar May 13 '25 21:05 Cryoris

Re: min_joinable_duration -- this shouldn't set the amount of DD inserted, but rather control how adjacent idle blocks are joined. It's a knob that lets you trade-off between potentially many short sequences or longer blocks. The default value is picked to ensure that a layer of CX/ECR will not have a lot of short DD sequences just due to the difference in duration. But this is not discussed in the paper afaik and I'm happy to change this if you think there's a more reasonable choice 🙂

The reason I treated this is as a knob to control the amount of DD is because of this check:

if (
    node.op.name == "delay"
    and not is_after_reset(node)
    and self._duration(node, qubit_map) > self._min_joinable_duration
   )

in _get_sorted_delays. It skips durations shorter than min_joinable_duration when collecting eligible delays, effectively becoming a filter of idle blocks based on their lengths. This is because shorter delays are not just omitted from adjacency blocks and have their own DD sequences, they are skipped altogether. If we want a truly min_joinable_duration behavior in the "joinable" sense, we'll probably want to handle the parameter in _collect_adjacent_delay_blocks. As for the value default value itself, I see the reasoning for layers so I think it's OK. But now I think that a name that better reflects how the logic is implemented currently is min_duration 😄

eliarbel avatar May 15 '25 07:05 eliarbel

Fair point, let's rename it 👍🏻

Cryoris avatar May 15 '25 11:05 Cryoris