Make it easier to efficiently construct large circuits from lists of operations
Is your feature request related to a use case or problem? Please explain
Say that I have a list of already-created Operations that I want to turn into a circuit. I know of at least three ways to do that, each with different performance.
import cirq
n_ops_per_moment = 100
n_moments = 500
def get_layer():
    return [cirq.X(cirq.q(i)) for i in range(n_ops_per_moment)]
layers = [get_layer() for _ in range(n_moments)]
Option 1
circuit = cirq.Circuit(layers)
150 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Option 2
circuit = cirq.Circuit.from_moments(*layers)
28.3 ms ± 251 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Option 3
circuit = cirq.Circuit.from_moments(*[cirq.Moment.from_ops(*layer) for layer in layers])
13.1 ms ± 418 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Describe the solution you would prefer
The last one is the fastest, but also the most awkward to write. At the very least, it would be nice to have a method on Circuit that makes it easier:
@classmethod
def from_ops(cls, ops: Sequence[Sequence[Operation]]) -> Circuit:
    ...
But overall it would be beneficial if there weren't large performance differences between Circuit construction methods since it makes it hard for users to do the best thing.
How urgent is this for you? Is it blocking important work?
P2 – we should do it in the next couple of quarters
Messaging so I can be assigned.
Cirq Cynq: the first thing to explore is to profile Options 2 and 3 and find out why there is a 2-fold difference in time. Perhaps they can be optimized to perform equally.
Discussed in Cirq Cynq 2025-11-12:
- Should do some profiling
- Look at how Options 2 and 3 differ in processing. Need to figure out what's different about the internal processing.
Option 1 left-aligns everything, so requires a bunch more calculation and rearranging. There may be some spot improvements possible, but it's been fairly well optimized.
Options 2 and 3 just load ops in order, so they are much faster. The difference between 2 and 3 is that the latter calls Moment.from_ops, whereas 2 calls the constructor. from_ops is documented to be faster in cases where the ops are a list/tuple and don't need to be flattened. https://github.com/quantumlib/Cirq/blob/main/cirq-core/cirq/circuits/moment.py#L131-L134
I think the option suggested in the issue, creating a Circuit.from_ops(ops: seq[seq[op]]) would be reasonable, in parity with Moment.from_ops.
Maybe op_tree.flatten_to_ops itself could be optimized a bit. Right now it recurses and yields every element, which may add a lot of overhead (https://github.com/quantumlib/Cirq/blob/main/cirq-core/cirq/ops/op_tree.py#L72-L88). I'm not sure whether there's a faster way to implement that, but if so, that might eliminate the need for a from_ops optimization in Circuit and Moment.
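To make the idea concrete, here is a pure-Python sketch of the two shapes such a flattener could take. It is a simplified model, not Cirq's code: `Op` is a stand-in leaf type, and both function names are hypothetical. Whether the iterative form is actually faster would need to be measured.

```python
from typing import Any, Iterator, List


class Op:
    """Stand-in for an operation: a leaf that must not be iterated into."""

    def __init__(self, name: str) -> None:
        self.name = name


def flatten_recursive(tree: Any) -> Iterator[Op]:
    # Recursive generator: each leaf passes through one generator
    # per nesting level on its way out.
    if isinstance(tree, Op):
        yield tree
        return
    for subtree in tree:
        yield from flatten_recursive(subtree)


def flatten_iterative(tree: Any) -> List[Op]:
    # Explicit stack of iterators: each leaf is appended exactly once.
    if isinstance(tree, Op):
        return [tree]
    result: List[Op] = []
    stack = [iter(tree)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, Op):
                result.append(item)
            else:
                # Descend into the nested container; the parent iterator
                # keeps its position and is resumed after the child is done.
                stack.append(iter(item))
                break
        else:
            stack.pop()
    return result


tree = [Op("a"), [Op("b"), [Op("c")], Op("d")], (Op("e"),), []]
print([op.name for op in flatten_iterative(tree)])
```

Both functions visit leaves in the same left-to-right order, so the iterative version could be swapped in without changing results in this model.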