tqec Implement Y basis initialization/measurement block

This PR implements the inplace Y-basis initialization and measurement block. Closes #548.

Circuit Construction

Craig’s original circuit is designed under the fixed-bulk convention with an X-top boundary. The memory round uses S-shaped CX ordering for X stabilizers and N-shaped ordering for Z stabilizers. At first, I thought we could simply rotate and reflect the circuit to obtain versions for all conventions and boundary orientations. However, it turns out that the CX ordering in the memory round before/after the Y-basis block matters: the interaction ordering within the Y-basis block must be adjusted accordingly to avoid tricky space/time/spacetime-like errors that would reduce the circuit distance.

The challenging part of the circuit is the transition round, which transforms the code patch between a normal surface code patch and a degenerate patch with no encoded qubit. Careful design of the diagonal twist line, the domain wall orientation, and the interaction ordering is needed to achieve good logical performance. I do not yet have a systematic method for designing such circuits, but by following the pattern of the original circuit, I have worked out correct versions for all conventions and boundary orientations (three variants in total, since two conventions share the same circuit with a Z-top boundary). Key points include:

- Use reverse ordering of the memory round during the transition round.
- Ensure the circuit realizes the walk-then-merge operation by adjusting certain interactions in the transition round. This creates the expected domain wall and twist line topologically.
- If circuit distance decreases, try using the other diagonal for the twist line, and choose different corners for the single MY operation to avoid near-twist spacetime-like errors.

Here are the final Crumble circuits for a Y-basis initialize-then-measure experiment:

Implementation

Initially, I tried to use the existing RawCircuitLayer to implement the Y-basis block within the current architecture. However, I found it difficult to align and merge layers with symbolic linear expressions of k. For example, the Y-basis block has 1 + k + 1 layers, while regular cubes have 1 + (2k - 1) + 1 layers. To handle this, we would need to unroll the repetition layers depending on the value of k, then align the layers at either the head or the tail for merging. On top of that, significant changes would be required to support detector annotations in RawCircuitLayer, since the current approach is based on template and plaquette concepts. While it still seems possible to implement the Y-basis block with this layered approach, it would require too much work.
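
To make the alignment problem concrete, here is a minimal sketch in plain Python. It does not use RawCircuitLayer or any other TQEC API; the `unroll` and `align` helpers and the layer names are hypothetical. It only illustrates that the two layer counts can be lined up at the head or the tail once the repetition layers are unrolled for a concrete value of k.

```python
def unroll(head, body, repetitions, tail):
    """Flatten `head + body * repetitions + tail` into an explicit list of layers."""
    return [head] + [body] * repetitions + [tail]


def align(main, injected, anchor):
    """Pair up two unrolled layer lists, padding the shorter one with None.

    anchor="head" lines up the first layers; anchor="tail" lines up the last ones.
    """
    if len(injected) > len(main):
        raise ValueError("this sketch assumes the injected block is not longer")
    padding = [None] * (len(main) - len(injected))
    padded = injected + padding if anchor == "head" else padding + injected
    return list(zip(main, padded))


k = 2  # the amount of padding depends on the concrete value of k
cube = unroll("init", "memory", 2 * k - 1, "measure")    # 1 + (2k - 1) + 1 layers
y_block = unroll("y_init", "y_memory", k, "y_measure")   # 1 + k + 1 layers

tail_aligned = align(cube, y_block, anchor="tail")
head_aligned = align(cube, y_block, anchor="head")
```

The padding is (2k + 1) - (k + 2) = k - 1 layers, so it cannot be fixed while k is still symbolic, which is the difficulty described above.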

The current implementation of Y-basis blocks is spacetime-oriented. In this design, a YHalfCube can only connect temporally to other cubes and is spatially distinct from surrounding cubes. This makes it natural to construct the circuits for the main computation and the “temporally injected” Y-basis blocks separately, then align and merge them at the circuit level. The bulk of detectors is built independently for each circuit; we only need to add the missing detectors at the injection interface during merging. With this in mind, I introduced a new block type called InjectedBlock, which includes a factory callable that generates both the Y-basis block circuit and its detector-flow interface. The injected blocks are attached to the LayerTree. After building the main circuit from the LayerTree, I iterate through the injected blocks, aligning and merging their circuits layer by layer. The missing detectors are then computed from the flow interface and added.

Circuit alignment and merging is the most challenging part. We need to carefully track flows and measurement records to correctly update existing detector and observable definitions. I relied heavily on the gen package, which I used to convert stim circuits into gen.Layers (with layer type information) and to merge two gen.LayerCircuits using a simple merging strategy. The current implementation has only minimal checks on circuit structure and may not work reliably for more generic injected circuits. However, it works for the specific Y-basis block circuits tested so far.
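
As an illustration of the bookkeeping involved, the sketch below shows how a detector's relative measurement lookbacks change when extra measurements are merged into the record. It is not the code in this PR: the function names are hypothetical, and the ordering rule (main-circuit measurements first, injected measurements after them within each layer) is an assumption made only for this example.

```python
def merged_index_map(main_counts, injected_counts):
    """Map old absolute measurement indices of the main circuit to new ones.

    Assumption for this sketch: within each merged layer, the main circuit's
    measurements come first and the injected circuit's measurements follow.
    """
    mapping = {}
    old_index = new_index = 0
    for n_main, n_injected in zip(main_counts, injected_counts):
        for _ in range(n_main):
            mapping[old_index] = new_index
            old_index += 1
            new_index += 1
        new_index += n_injected  # skip the slots taken by injected measurements
    return mapping


def remap_lookbacks(lookbacks, old_total, new_total, mapping):
    """Rewrite relative lookbacks (-1 is the latest measurement) after merging."""
    absolute = [old_total + lookback for lookback in lookbacks]
    return [mapping[index] - new_total for index in absolute]


# Two layers, each with 2 main measurements and 1 injected measurement.
mapping = merged_index_map(main_counts=[2, 2], injected_counts=[1, 1])
# A detector placed after the second layer of the main circuit, before merging.
new_lookbacks = remap_lookbacks([-1, -3], old_total=4, new_total=6, mapping=mapping)
```

Converting to absolute indices first keeps the remapping independent of where the detector sits in the circuit; only the totals before and after merging are needed to convert back.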

Since I used gen extensively during both construction and merging, I also ran into a few small issues with the package that required modifying its source code. For this reason, I created a local copy of the repository, clarifying the license and documenting my modifications in the README. In the long run, it would be preferable to re-implement the most useful functionality from gen directly in TQEC, but this was the fastest way to get things working.

So far, I have only tested compilation with Y-basis blocks in a simple Y-memory experiment and in an S-gate teleportation experiment. A working example of the S-gate has been added to the gallery in the docs. There may still be bugs for more complex computational structures that I haven’t tested yet. We will also need more meaningful test cases involving Y-basis blocks. Additionally, the implementation is not optimized for performance, and there is plenty of room for speed improvements. For example, all circuits are flattened (their loops unrolled) during merging for simplicity, even though in a larger computation most layers need no flattening at all. That said, the feature is already usable on this feature branch.

References

Inplace Access to the Surface Code Y Basis

Oct 03 '25 12:10 inmzhang

This is amazing, thank you Yiming.

I can review this PR over the next week because I attempted the same block and am familiar with the gen repo. Others can and should take a look too. Some questions based on your initial comment:

Craig’s original circuit is designed under the fixed-bulk convention with an X-top boundary. I have worked out correct versions for all conventions and boundary orientations (three variants in total, since two conventions share the same circuit with a Z-top boundary).

What does it mean for a circuit to have an X- or Z-top boundary? From your Crumble links, it looks like this refers to the orientation of the logical boundaries in the 2D spatial plane before introducing the domain wall. However, I do not see how this explains why the transition round for a Z-top boundary is identical across both conventions, while for an X-top boundary it depends on the convention.

The current implementation of Y-basis blocks is spacetime-oriented. In this design, a YHalfCube can only connect temporally to other cubes and is spatially distinct from surrounding cubes. This makes it natural to construct the circuits for the main computation and the “temporally injected” Y-basis blocks separately, then align and merge them at the circuit level. With this in mind, I introduced a new block type called InjectedBlock, which includes a factory callable that generates both the Y-basis block circuit and its detector-flow interface. The injected blocks are attached to the LayerTree. After building the main circuit from the LayerTree, I iterate through the injected blocks, aligning and merging their circuits layer by layer. The missing detectors are then computed from the flow interface and added.

Would you recommend using an InjectedBlock for walking codes #491? I believe it makes sense for a slide or glide taking 2d rounds to also be temporally injected.

Additionally, the implementation is not optimized for performance, and there is plenty of room for speed improvements. For example, all circuits are flattened (their loops unrolled) during merging for simplicity, even though in a larger computation most layers need no flattening at all.

Where can I find this flattening in your changed files?

Oct 03 '25 15:10 KabirDubey

Thanks Kabir,

What does it mean for a circuit to have an X- or Z-top boundary?

I'm referring to the top/bottom boundary basis of the regular surface code patch before the transition to the degenerate patch.

However, I do not see how this explains why the transition round for a Z-top boundary is identical across both conventions, while for an X-top boundary it depends on the convention.

If you look at the stabilizers of the Z-top boundary surface code, you will find that they are exactly the same for the two conventions. They are the same at the code level, not only at the circuit level.

Would you recommend using an InjectedBlock for walking codes https://github.com/tqec/tqec/issues/491? I believe it makes sense for a slide or glide taking 2d rounds to also be temporally injected.

Actually, I still don't see direct block injection as an elegant way to implement even Y-basis blocks, because of the amount of hacky code it requires for circuit alignment and merging. If there is a good idea for it, I would prefer to extend the layered approach and implement them there. The problem with implementing a walking code as an injected block may be that it can also connect to other blocks in space and therefore needs a well-defined flow interface in the spatial directions, which a temporally injected block does not include. Correct me if I'm wrong on this point.

Where can I find this flattening in your changed files?

In src/tqec/compile/tree/injection.py. Search for `flattened`; it appears in several places.

Oct 04 '25 02:10 inmzhang

The docs build failed because I replaced the values of the PauliBasis enum with uppercase "X"/"Y"/"Z", so deserializing the saved detector database now raises an exception. When I tested locally, I simply removed the database file and rebuilt it. What's the preferred way to introduce breaking changes to the detector database: should I remove the file or bump the database version? Any ideas @BSchelpe?

Oct 04 '25 02:10 inmzhang

The docs build failed because I replaced the values of the PauliBasis enum with uppercase "X"/"Y"/"Z", so deserializing the saved detector database now raises an exception. When I tested locally, I simply removed the database file and rebuilt it. What's the preferred way to introduce breaking changes to the detector database: should I remove the file or bump the database version? Any ideas @BSchelpe?

Bumping the database version is the right thing to do, thanks! It means that if you're using the default database it will automatically rebuild next time you run the code, but if it is a custom database you will have to manually delete and rebuild the file. (At least this was the behaviour last time I touched the code, I don't know whether @Zhaoyilunnn might have made any changes to this part of the behaviour when he was looking at parallelisation/IO?)

My only question is whether your change should bump the major or the minor version number. @nelimee's suggestion in the function documentation is:

- MAJOR when the format of the file changes (i.e. when the attributes of DetectorDatabase change),
- MINOR when the content of the database is invalidated (e.g. by changing a plaquette implementation without changing its name).

If I've understood correctly what your change is (which I might not have!) I think yours counts as a minor change and so the number should increase to 1.1.0?

Oct 04 '25 09:10 BSchelpe

Bumping the database version is the right thing to do, thanks! It means that if you're using the default database it will automatically rebuild next time you run the code

I did not know this was possible. How does one bump the database version? I have been deleting the database (similar to Yiming).

Oct 04 '25 11:10 purva-thakre

Bumping the database version is the right thing to do, thanks! It means that if you're using the default database it will automatically rebuild next time you run the code

I did not know this was possible. How does one bump the database version? I have been deleting the database (similar to Yiming).

If you look at Yiming's last commit (26699d9) that's the place in the code to do it. It's the variable CURRENT_DATABASE_VERSION in compile/detectors/database.py that you want to increment in your local code to get this behaviour. NB this is bumping the version of the database that the code expects (and the version that the code writes). Once it has overwritten your default database with the new version number it will continue to use that database for future runs of your local code.
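
Roughly, the behaviour described in this thread can be pictured as follows. This is only a sketch: the `Version` type, the `action_on_load` helper, and the version value are hypothetical and are not copied from compile/detectors/database.py.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Illustrative value only; the real constant lives in compile/detectors/database.py.
CURRENT_DATABASE_VERSION = Version(1, 1, 0)


def action_on_load(stored: Version, is_default_database: bool) -> str:
    """Decide what to do with a database file whose version is `stored`."""
    if stored == CURRENT_DATABASE_VERSION:
        return "load"
    if is_default_database:
        return "rebuild"  # default database: regenerate and overwrite automatically
    return "manual"  # custom database: the user must delete and rebuild the file
```

For example, `action_on_load(Version(1, 0, 0), is_default_database=True)` returns `"rebuild"`, matching the automatic rebuild of the default database after a bump.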

Oct 04 '25 13:10 BSchelpe

At least this was the behaviour last time I touched the code, I don't know whether @Zhaoyilunnn might have made any changes to this part of the behaviour when he was looking at parallelisation/IO

Thanks @BSchelpe, parallelism should not impact this behavior if I remember correctly.

Oct 04 '25 13:10 Zhaoyilunnn

At least this was the behaviour last time I touched the code, I don't know whether @Zhaoyilunnn might have made any changes to this part of the behaviour when he was looking at parallelisation/IO

Thanks @BSchelpe, parallelism should not impact this behavior if I remember correctly.

Thanks for confirming!

Oct 04 '25 16:10 BSchelpe

@KabirDubey There are a large number of conflicts in this branch. Work together with Yiming to merge #736 because either this PR or your PR is going to run into a lot of issues.

Oct 15 '25 02:10 purva-thakre

@KabirDubey There are a large number of conflicts in this branch. Work together with Yiming to merge #736 because either this PR or your PR is going to run into a lot of issues.

Thanks for noting that! I'll work on it later today.

Oct 15 '25 02:10 inmzhang

@KabirDubey

Is there a significant difference between the S gate teleportation notebook and Lao and Criger? Also note this analysis by Kwok Ho. Wasn’t sure whether to insert a cite for those ones or the original.

The state injection + gate teleportation technique is quite standard in the FTQC literature, and it was known before the papers you linked. If you think it's better to add a citation for it, I would recommend this one.

What is your recommended workflow for designing these circuits? Did you need to compare raw Stim files?

I first build the entire circuit (without detectors) in Crumble, then add the observables using Pauli marks (Crumble can convert Pauli marks into observables via a keybinding). After that, I save the circuit to a file, automatically annotate the detectors with tqecd, and add errors to verify that the circuit’s distance matches expectations using Stim.

This process assumes I already know what the flows should look like during circuit design—these determine the domain walls and twist lines. That way, I can confirm the circuit correctly implements the intended topological gate before checking finer details like interaction ordering and other subtleties.

What is a more complex or meaningful test case than a Y basis memory experiment and S gate injection? AFAICT that is all the primitives we have access to without feedforward.

For example, a computation that includes other blocks at the same Z-layer as the Y blocks, or one that has multiple Y blocks within the same Z-layer. This would more thoroughly test circuit merging and other subtle details.

I noticed that we needed to fix detector lookbacks after injection shifted the measurement indices in _TreeMeasurementTracker. From looking through your code, I think the challenge is that we cannot simultaneously both use the layers framework and gen because the flow dependencies in gen don't align with layer boundaries. Would be nice to get on the same page on what exactly are the limitations of the layer framework.

1. I currently don’t know how to merge the layers of Y blocks with other blocks at the same Z coordinate without knowing the exact value of `k`. The layered approach requires instantiating layer objects using symbolic `k`, which appears in the repetition layers. However, merging `1+k+1` layers with `1+(2k-1)+1` layers and aligning their heads and tails requires the actual value of `k`. One possible workaround might be to introduce a lazy mechanism for handling layer instantiation and merging.

2. If we use `RawCircuitLayer` to build the Y-block circuit, we can generate the circuit using `gen` or utils by ourselves. For detectors, we have two options:
   (1) Use a flow-based approach to build detectors for the Y-block circuit, which would require defining interface flows. A general layer would also need flow definitions for spatial directions.
   (2) Use `tqecd` for detector annotation, but that would require modifying the current detector annotation framework to handle raw circuits.

3. It’s also possible to define the Y block using template- and plaquette-based layers, though this would require more work. Once the first issue is solved, this approach should integrate smoothly into the existing layered framework.

I still see the injected approach as a quick way to get it working. In the future I hope we can re-implement it and maybe get rid of the dependency on gen as well.

Oct 15 '25 07:10 inmzhang

What is a more complex or meaningful test case than a Y basis memory experiment and S gate injection? AFAICT that is all the primitives we have access to without feedforward.

For example, a computation that includes other blocks at the same Z-layer as the Y blocks, or one that has multiple Y blocks within the same Z-layer. This would more thoroughly test circuit merging and other subtle details.

I noticed that we needed to fix detector lookbacks after injection shifted the measurement indices in _TreeMeasurementTracker. From looking through your code, I think the challenge is that we cannot simultaneously both use the layers framework and gen because the flow dependencies in gen don't align with layer boundaries. Would be nice to get on the same page on what exactly are the limitations of the layer framework.
1. I currently don’t know how to merge the layers of Y blocks with other blocks at the same Z coordinate without knowing the exact value of `k`. The layered approach requires instantiating layer objects using symbolic `k`, which appears in the repetition layers. However, merging `1+k+1` layers with `1+(2k-1)+1` layers and aligning their heads and tails requires the actual value of `k`. One possible workaround might be to introduce a lazy mechanism for handling layer instantiation and merging.

2. If we use `RawCircuitLayer` to build the Y-block circuit, we can generate the circuit using `gen` or utils by ourselves. For detectors, we have two options:
   (1) Use a flow-based approach to build detectors for the Y-block circuit, which would require defining interface flows. A general layer would also need flow definitions for spatial directions.
   (2) Use `tqecd` for detector annotation, but that would require modifying the current detector annotation framework to handle raw circuits.

3. It’s also possible to define the Y block using template- and plaquette-based layers, though this would require more work. Once the first issue is solved, this approach should integrate smoothly into the existing layered framework.

I still see the injected approach as a quick way to get it working. In the future I hope we can re-implement it and maybe get rid of the dependency on gen as well.

We can make follow-up issues regarding the above when this gets merged.

Oct 15 '25 22:10 KabirDubey

LGTM. My understanding is that once Yiming approves my PR (#736), the plan is for someone to first merge my branch kd/updates-on-top-of-719 into Yiming’s feat/y-basis-block, and then merge that combined branch into main. I verified that the GitHub Action checks pass for both branches, but since I don’t have much experience with Actions, please double-check that everything looks good.

I'm planning to merge your PR first, then fix some gaps in test coverage and request a final review before merging this.

Oct 16 '25 02:10 inmzhang

@KabirDubey When checking the test coverage for y_basis.py, I noticed that the “ANTI” diagonal type is never used, and the same goes for get_new_boundary_for_basis and get_old_boundary_for_basis. While a Y-basis circuit with an “ANTI” diagonal twist line can be constructed for certain interaction orderings, it’s not needed for any of the TQEC variants. Moreover, the circuit-building logic would definitely need changes to support the ANTI diagonal case. I also found that the abstractions for twist lines, boundary regions, and geometry aren’t particularly helpful — they mainly serve as data containers and seem a bit overengineered. Therefore, I’m inclined to safely revert your changes made to y_basis.py, which would significantly reduce the code size. Are you okay with that?

Oct 16 '25 12:10 inmzhang

@KabirDubey When checking the test coverage for y_basis.py, I noticed that the “ANTI” diagonal type is never used, and the same goes for get_new_boundary_for_basis and get_old_boundary_for_basis. While a Y-basis circuit with an “ANTI” diagonal twist line can be constructed for certain interaction orderings, it’s not needed for any of the TQEC variants. Moreover, the circuit-building logic would definitely need changes to support the ANTI diagonal case. I also found that the abstractions for twist lines, boundary regions, and geometry aren’t particularly helpful — they mainly serve as data containers and seem a bit overengineered. Therefore, I’m inclined to safely revert your changes made to y_basis.py, which would significantly reduce the code size. Are you okay with that?

Yes! Sorry, I tried to come up with something that could systematically generate layers, but I couldn't figure out how to specify the geometric constraints. What remained was a lot of dead code. You might be able to replace it with simple helper functions taking distance, top_boundary_basis, and convention as parameters, but a total reversion makes complete sense as well.

Oct 16 '25 16:10 KabirDubey

After adding more tests involving Y-basis blocks, I found additional issues. For example, the following structure compiles into a circuit whose logical observable anti-commutes with the reset, which clearly indicates a problem in circuit construction or merging.

And this structure

produces the following distances:

Convention       k          Distance       Expected
--------------------------------------------------------
Fixed-bulk       1            3                3
Fixed-bulk       2            5                5
Fixed-bulk       3            6                7
Fixed-boundary   1            2                2
Fixed-boundary   2            3                4
Fixed-boundary   3            4                5

I need more time to identify the root cause of the issue and find a fix.

Oct 17 '25 04:10 inmzhang

The last layers of the Crumble example distance 5 Y basis initialization circuit are a particular pattern of MPPs. Gidney says "the measurement process is finished by destroying the patch by measuring all of its data qubits. To maximize code distance, each data qubit is measured in the basis of its closest boundary." I haven't figured out how that implies the exact pattern we see in the circuit but I see that your Crumble circuit does not do the same, right? Do we pay the price in code distance?

Oct 20 '25 04:10 KabirDubey

The last layers of the Crumble example distance 5 Y basis initialization circuit are a particular pattern of MPPs. Gidney says "the measurement process is finished by destroying the patch by measuring all of its data qubits. To maximize code distance, each data qubit is measured in the basis of its closest boundary." I haven't figured out how that implies the exact pattern we see in the circuit but I see that your Crumble circuit does not do the same, right? Do we pay the price in code distance?

No, it’s not that issue. The data qubit measurements you mentioned correspond to the final twist line, which is orthogonal to the one constructed by the stabilizer walking operation. My circuit already accounts for that by design.

I haven’t started investigating the problem yet, but I’ll update once I have a more precise idea of its root cause.

Oct 20 '25 06:10 inmzhang

The last layers of the Crumble example distance 5 Y basis initialization circuit are a particular pattern of MPPs. Gidney says "the measurement process is finished by destroying the patch by measuring all of its data qubits. To maximize code distance, each data qubit is measured in the basis of its closest boundary." I haven't figured out how that implies the exact pattern we see in the circuit but I see that your Crumble circuit does not do the same, right? Do we pay the price in code distance?

No, it’s not that issue. The data qubit measurements you mentioned correspond to the final twist line, which is orthogonal to the one constructed by the stabilizer walking operation. My circuit already accounts for that by design.

I haven’t started investigating the problem yet, but I’ll update once I have a more precise idea of its root cause.

I'm asking generally, not as an explanation for the bug (I haven't looked into that yet either). Are you saying that your circuit implements an equivalent data qubit measurement?

Oct 20 '25 06:10 KabirDubey

The last layers of the Crumble example distance 5 Y basis initialization circuit are a particular pattern of MPPs.

MPPs are used to terminate the out flows from the initialization circuit for demonstration purposes. MPP is not a real physical operation available on superconducting qubits. In a Y-basis memory experiment, there are no MPPs; the out flows from the Y-basis initialization are terminated by the Y-basis measurement circuit instead.

Gidney says "the measurement process is finished by destroying the patch by measuring all of its data qubits. To maximize code distance, each data qubit is measured in the basis of its closest boundary."

This is in the context of Y-basis measurement (the reverse of initialization). In the final layer of the Y-basis measurement, we measure (nearly) half of the data qubits in the X basis and the rest in the Z basis, which forms a twist line and preserves the distance. You can see from the Crumble links in this PR that all the circuits implement the data qubit resets/measurements in a similar pattern at the first/last layer.

Oct 20 '25 06:10 inmzhang

Package	Line Rate	Health
src.tqec	100%	✔
src.tqec.circuit	96%	✔
src.tqec.circuit.schedule	99%	✔
src.tqec.compile	92%	✔
src.tqec.compile.blocks	97%	✔
src.tqec.compile.blocks.layers	95%	✔
src.tqec.compile.blocks.layers.atomic	97%	✔
src.tqec.compile.blocks.layers.composed	99%	✔
src.tqec.compile.detectors	89%	✔
src.tqec.compile.observables	99%	✔
src.tqec.compile.specs	97%	✔
src.tqec.compile.specs.library	97%	✔
src.tqec.compile.specs.library.generators	98%	✔
src.tqec.compile.tree	67%	➖
src.tqec.compile.tree.annotators	84%	✔
src.tqec.computation	96%	✔
src.tqec.gallery	100%	✔
src.tqec.interop	90%	✔
src.tqec.interop.collada	94%	✔
src.tqec.interop.pyzx	85%	✔
src.tqec.interop.pyzx.synthesis	91%	✔
src.tqec.plaquette	90%	✔
src.tqec.plaquette.compilation	100%	✔
src.tqec.plaquette.compilation.passes	95%	✔
src.tqec.plaquette.compilation.passes.transformer	99%	✔
src.tqec.plaquette.rpng	95%	✔
src.tqec.plaquette.rpng.translators	97%	✔
src.tqec.post_processing	82%	✔
src.tqec.post_processing.utils	96%	✔
src.tqec.templates	94%	✔
src.tqec.utils	96%	✔
Summary	93% (8117 / 8748)	✔

Nov 24 '25 04:11 github-actions[bot]

A preview of bcad2d82ee25362166fb901f5c01bb08162928dc is uploaded and can be seen here:

✨ https://tqec.github.io/tqec/pull/719/ ✨

Changes may take a few minutes to propagate.

Nov 24 '25 04:11 github-actions[bot]

I used this block graph for debugging and noticed that some detectors include unnecessary measurements during the split step. As a result, they form hyperedges in the decoding graph, which reduces the circuit distance when searching for graphlike logical errors. You can see these hyper-detectors at the right boundary of the left logical patch in the detslice diagram.

This issue arises from calling gen.ChunkSemiAuto.solve() when we attempt to automatically solve the flows from the split round to the Y-basis measurement rounds. Internally, it calls stim.Circuit.solve_flow_measurements(), which does not guarantee that the chosen solution measurements are minimal.

In the latest commits, I applied a temporary workaround: first compute all flows from the split fragment using tqecd, then perform a one-to-one match to the expected end stabilizers—i.e., the start stabilizers required by the Y-basis transition round. After that, we call gen.ChunkSemiAuto.solve() only on the remaining unsolved flows. With this approach, the circuit distances for all the cases mentioned above now match the expected values.
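
The partition step of this workaround can be sketched as below. This is a simplified illustration, not the code in the PR: the `split_flows` helper is hypothetical, stabilizers are represented as plain strings, and the actual flow computation (tqecd) and the fallback solver (gen.ChunkSemiAuto.solve()) are not called here.

```python
def split_flows(computed, expected_end_stabilizers):
    """Partition expected end stabilizers into matched and unsolved ones.

    `computed` maps an end stabilizer (here a plain string such as "Z0*Z1")
    to the set of measurement indices of the flow that produces it.
    """
    matched = {}
    unsolved = []
    for stabilizer in expected_end_stabilizers:
        if stabilizer in computed:
            matched[stabilizer] = computed[stabilizer]
        else:
            unsolved.append(stabilizer)  # left for the fallback solver
    return matched, unsolved


computed = {"Z0*Z1": {3}, "X1*X2": {4, 7}}
expected = ["Z0*Z1", "X1*X2", "Y0*Y2"]
matched, unsolved = split_flows(computed, expected)
```

Only the stabilizers in `unsolved` would then be handed to the automatic solver, so the flows that already have a known measurement set keep it unchanged.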

However, this fix is fragile and far from elegant. Ideally, we should move toward using flow annotations for more efficient detector computation. If possible, I would like this PR to remain open for a while so we can revisit and rework it once the improved architecture is available.

Nov 24 '25 04:11 inmzhang