qiskit
qiskit copied to clipboard
Reorganize circuit elements class hierarchy
The things that can be added in a circuit (let's call them CircuitElement
for now) are growing a lot and some reorganization is needed. This discussion issues is to talk about that reoganization. I'll try to keep the first comment updated as the discussion progress. Also, if you have the privileges, you can edit this directly.
CircuitElement
Root abstract class.
- attributes:
- name
- methods:
- append
OpaqueInstruction(CircuitElement)
Not reversible.
Reset(OpaqueInstruction)
Instruction(OpaqueInstruction)
Circuit(CircuitElement)
Implements current circuit. Not necessarily reversible.
Gate(Circuit)
Implements current gate. Reversible.
- Methods:
-
to_matrix
-
control
-
ControlledGate(Gate)
Directive(CircuitElement)
Barrier(Directive)
Thanks @1ucian0 for starting the discussion on this. This has been a long while coming, and what's above is a good description of the structure where things currently stand.
A good place to start the discussion is collecting the use cases and behaviors we want to support, and designing roles/mixins around those (similar to what's been done recently with Operator
s). Then we can focus our design on how Terra expects to consume these objects without being too prescriptive on how they are implemented or extended by users.
For basic instruction on the circuit:
- How should this operation be drawn?
- How many qubits/clbits does this operation input/output?
- How many parameters does it accept (names, types, ranges)?
- ...
For circuit building:
- Is this operation invertible (and how)?
- Is this operation broadcastable (and how)?
- Can this operation be controlled?
- Can this operation be conditioned (classically)?
- ...
Serialization:
- How is this operation serialized to QASM2?
- Qobj?
- QASM3?
- ...
Transpilation:
- Does this operation have a unitary description? An efficiently obtainable one?
- Can this operation be unrolled into one or more other operations?
- Does this operation describe a logical operation, or a physical one (implemented by a backend, with noise, constraints)?
- ...
And likely there are more that I missed.
Along with this redesign would need to come a change in the transpiler to by default, leave in place, operations it does not know how to operate over.
Likewise, it'd be good to collect a set of use cases that we know are challenging with our current structure, to see that they're better supported in the future.
The two that immediately come to mind are:
- Backends looking to define and expose new instructions, that will be inserted/drawn/assembled but the transpiler will either a) know how to optimize to or b) know to ignore ( #5701 ).
- Being able to distinguish operations with a known description, from ones with a known implementation, e.g. an operator with a unitary matrix/Clifford table description (without requiring pre-specified synthesis like exists now for UnitaryGate, isometries, etc.)
some earlier related issues: https://github.com/Qiskit/qiskit-terra/issues/4700, https://github.com/Qiskit/qiskit-terra/issues/3800 also i made a comment about class names here:https://github.com/Qiskit/qiskit-terra/issues/5774
After some renewed interest from the ongoing work in https://github.com/Qiskit/qiskit-terra/labels/synthesis , it looks like there are two issues at play here:
- organization of the circuit library and
Instruction
subclasses (#4011 , #4520 , #5701 , #6883 , #6531, and others), and - standardization of an interface for high-level operations (from within terra and without) to be stored on the circuit without requiring them to be synthesized ahead-of-time (to support the decoupling of an operation's semantic/behavioral representation from its structural implementation, as described in #4700 for e.g. Cliffords, unitaries, arithmetic circuits, ..., with renewed interest for #6975, and #6984).
These can be addressed independently, and so the below focuses only on the latter.
Currently, higher-level operations can be appended to circuits in one of two ways:
- synthesis of the higher-level operation to a composite
Instruction
upon insertion (e.g. forClifford
s, https://github.com/Qiskit/qiskit-terra/blob/8cb3a1ad802ffc76753190ef9d03522f2ba63b53/qiskit/quantum_info/operators/symplectic/clifford.py#L373), or - storing the higher-level operation as a custom attribute of an
Instruction
subclass, with a_define
method that includes a default synthesis routing (e.g. forUnitaryGate
https://github.com/Qiskit/qiskit-terra/blob/8cb3a1ad802ffc76753190ef9d03522f2ba63b53/qiskit/extensions/unitary.py#L59 or forClassicalFuction
https://github.com/Qiskit/qiskit-terra/blob/8cb3a1ad802ffc76753190ef9d03522f2ba63b53/qiskit/circuit/classicalfunction/classicalfunction.py#L46 )
From the perspective of #4700, the former has limitations in that it:
- causes circuit construction to become costly (as all high-level operations must be synthesized at construction time),
- limits the ability of transpiler to select a synthesis method (where it may have compile-time information about gate fidelities, local qubit connectivity, or the availability of auxiliary qubits that might favor one particular choice), and
- prevents the transpiler from making high-level optimizations (e.g. combining multiple adjacent arithmetic instructions).
The latter does support #4700, but with constraints that:
- there isn't a consistent interface for finding the high-level operation that underlies a given
Instruction
(so the transpiler can't trivially identify if two e.g.Clifford
s are the same, without specialized code to find the stabilizer table of aClifford
, separate from that for finding the unitary of aUnitaryGate
, ...), - higher-level operations must be wrapped in an
Instruction
subclass (this adds marginal runtime and memory overhead, but is an additional step for implementers), and - higher-level operations which define a default synthesis method (via
_define
) require some care if a non-default method is to be used to ensure the default method isn't invoked by one of the unrolling passes (see e.g. #6349).
Ameliorating the above (through design, standardization and documentation) is a viable option for supporting #4700 , and should be considered.
An alternate approach has been discussed around using mixins as a means to standardize the interface for the operations that can be placed in a quantum circuit. Mixins in python (also called traits or roles) are a limited form of multiple inheritance which define a limited interface of behaviors (not overlapping with other mixins) which any class can implement and satisfy. (More detail on the rationale behind mixin usage here: https://www.python.org/dev/peps/pep-3133/#rationale )
As compared to the Instruction
sub-class approach above, mixins:
- don't require higher-level operations to build separate
Instruction
sub-classes or to adhere to theInstruction
class hierarchy in order to be added to a circuit (Operator
s, orMultiplier
s can have their own class structure, or not, independent of that ofInstruction
), - facilitate inspection of the high-level operations by the transpiler (as the object itself is contained on the circuit),
- allow for the standardization and documentation of the interface of what's required for a given operation to be stored on a circuit (some of this is currently encoded in the
Instruction
base class, some is tribal knowledge e.g. how to have a variadic gate type).
Mixins are not a built-in python language feature, and so require some care to ensure the pattern is followed (e.g. ensure no mixin defines an __init__
, and none have overlapping property or method names). Its possible this can be addressed with lint rules, or may not arise in practice without a large number of mixins. quantum_info.operator
has used this pattern since #5617 and may have some relevant experience.
Some rough pseudocode of how a mixin pattern might look:
from abc import ABC, abstractmethod
class CircuitOperationMixin(ABC):
@property
@abstractmethod
def name(self):
"""Unique string identifier for operation type. Used by the circuit and
others to collect common operations which differ only in their operands
(e.g. for resource estimation, serialization, and visualization).
"""
raise NotImplementedError
@property
@abstractmethod
def num_qubits(self):
"""The number of quantum bit operands required by the operation (in the
circuit model, this encompasses both input and output). Used by the circuit
in tracking and ordering quantum resources e.g. for calculating of depth.
"""
raise NotImplementedError
@property
@abstractmethod
def num_clbits(self):
"""The number of classical bit operands required by the operation (in the
circuit model, this encompasses both input and output). Used by the circuit
in tracking and ordering classical resources e.g. for calculating of depth.
"""
raise NotImplementedError
@property
@abstractmethod
def num_params(self):
"""The number of python operands required by the operation. The number
and type of these operands will in general be dependent on the operation
type. These operands must be resolved prior to device execution. Used by
the circuit in binding and assignment of parameters.
"""
raise NotImplementedError
@property
@abstractmethod
def params(self):
"""Returns a python tuple of the python operands currently assigned to
this operation.
"""
raise NotImplementedError
@params.setter
@abstractmethod
def params(self):
"""Accepts a python tuple of python operands to be assigned to this operation.
Performs and necessary validation and updates to internal data structures.
"""
raise NotImplementedError
...
class Clifford(BaseOperator, AdjointMixin, CircuitOperationMixin):
...
@property
def name(self):
return 'clifford'
# implements @property num_qubits through BaseOperator
@property
def num_params(self):
return 1
@property
def num_clbits(self):
return 0
@property
def params(self):
return (self._table,)
class Instruction(CircuitOperationMixin): # Otherwise unchanged
...
Classes which implement the CircuitOperationMixin
interface are valid to add to a QuantumCircuit
, and so QuantumCircuit.append
would be updated to instead check isinstance(instruction, CircuitOperationMixin)
and we could deprecate and remove:
if not isinstance(instruction, Instruction) and hasattr(instruction, "to_instruction"):
instruction = instruction.to_instruction()
Mixins generally imply multiple inheritance, and multiple inheritance doesn't play nicely with __slots__
in the general case. We construct a lot of circuit element objects, and so for memory and performance reasons, it's probably going to be better for us to have slots on them all; they'll be faster to initialise, take less memory, and operations on them are faster. I actually was interested in removing the mixins from quantum_info
, in the interests of speeding up various transpiler passes that need to construct many Operator
instances, since the initialisation is slow and memory-hungry in the current form.
I'd want to avoid having Instruction
and Operator
having any ABC
instantiation checks in the classes if at all possible as well. Perhaps we could manage these properties through a metaclass, which takes a bunch of API definition classes as arguments, then defines any necessary __slots__
, and checks that all necessary methods are defined at class-definition time, rather than at object instantiation time?
I'd imagine we'd end up with initialisers that looked like
class XGate(metaclass=TraitManager([CircuitOperation, SelfInverse, SelfAdjoint])):
...
class Clifford(OperatorBase, metaclass=TraitManager([CircuitOperation, Adjoint, Inverse])):
...
or so. It's not immediately clear to me when we'd use regular inheritance (Clifford(OperatorBase)
v XGate()
) in conjunction with any managed trait system, but the metaclass machinery can access the bases anyway, so that isn't a limitation. The TraitManager
metaclass can enforce that all abstract methods are defined at class initialisation time, so we can avoid all of abc.ABC
's costly intialisation checks, but still get the benefits with regards to safety.
I have the rough sketch of how to do this in my head, but I don't want to commit it to code here in the issue, because I imagine there's a bunch of rough edges (subclass instantation, error handling, ergonomics of implementing the "does this object have this trait?" checks) that I'd need to think about more to do properly. I can imagine nice ways to handle all of them, but it'd need some testing to be sure.
For the caveats with using __slots__
and multiple inheritance, I came across https://docs.python.org/3/reference/datamodel.html#notes-on-using-slots , from which:
- The action of a slots declaration is not limited to the class where it is defined. slots declared in parents are available in child classes. However, child subclasses will get a dict and weakref unless they also define slots (which should only contain names of any additional slots).
- If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this. ...
- Multiple inheritance with multiple slotted parent classes can be used, but only one parent is allowed to have attributes created by slots (the other bases must have empty slot layouts) - violations raise TypeError.
seem relevant (though I'm not exactly sure how to parse the conditions and consequences of the second bullet). The last does seem fairly incompatible with use of __slots__
within both the object classes and mixins. (As a potential near term workaround, if the child class explicitly defines all the __slots__
of its parents, would this sidestep this problem?) The TraitManager
metaclass approach seems a reasonable solution to me (though agree one that would require some thinking and testing to avoid rough edges). Would we need something like a TraitManger
in-place before starting a potential move towards mixins? If not, maybe a quick POC would be worthwhile to potentially uncover any incompatibilities that might deter us from taking the mixin route.
I'd want to avoid having Instruction and Operator having any ABC instantiation checks in the classes if at all possible as well.
Can you clarify what's meant by ABC instantiation checks here? e.g. If I define a new class that's a subclass of abstract class which itself subclasses ABC
, is there an added cost on instantiating new instances of the child class?
I tried the following:
from abc import ABC, abstractmethod
class B():
def foo(self):
return 3
class Parent_abc(ABC):
@abstractmethod
def foo(self):
return 3
class Babc(Parent_abc):
def foo(self):
return 3
and got the following (python 3.6):
%%timeit
b = B()
101 ns ± 10.8 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
b = Babc()
103 ns ± 2.84 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
which seem comparable. Creating classes which subclass ABC
does seem slower though:
%%timeit
class Cabc(Parent_abc):
def foo(self):
return 3
23.4 µs ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
class Cabc():
def foo(self):
return 3
9.89 µs ± 572 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Slots
Yeah, the last bullet point in that list from the Python docs was what I was alluding to with mixins being a nuisance with slots. I'm pretty sure it's possible to write a metaclass to construct a __slots__
class with multiple inheritance, by scanning over the derived types and building up the __slots__
field manually, before it's passed to type
to construct. Class decorators can't do that in the same way, because they require that the class is first constructed, then the bound name is modified, so you end up "intercepting" the creation at the wrong point.
You can define slots that are already defined by a parent, currently, and nothing will complain, but it's wasteful of space. It shouldn't have speed implications, except for indirect ones caused by less efficient cache use. The reason is related to that second bullet point you weren't sure about (hidden in the expander).
Derailing technical details about __slots__...
Take, for example:
In [35]: class Parent:
...: __slots__ = ('a',)
...: class Override(Parent):
...: __slots__ = ('a',)
...: class Inherit(Parent):
...: __slots__ = ()
...:
In [36]: import sys
...: sys.getsizeof(Override()), sys.getsizeof(Inherit())
Out[36]: (48, 40)
I believe this is because at the C level, space for a PyObject *
corresponding to each item defined in __slots__
is reserved on instantiation, and then the object Parent.a
is actually a descriptor (like how @property
works), which fetches the value for an instance from a fixed, known pointer offset in memory. In contrast, __dict__
lookups have an extra step, where they first need to look up the offset of a given name in a hashmap, then they check if the resulting object is a descriptor, and then they return the value.
This means that in the case here, Override.a
is actually a @property
-like getter that's distinct from Parent.a
, whereas Inherit.a
is referentially exactly Parent.a
:
In [5]: Override.a
Out[5]: <member 'a' of 'Override' objects>
In [6]: Inherit.a
Out[6]: <member 'a' of 'Parent' objects>
So now there's more space than expected on Override
objects. I can't do Override.nonexistent = None
without getting an AttributeError, but I can do
In [7]: overridden = Override()
...: overridden.a = 'child'
...: Parent.a.__set__(overridden, 'parent')
...: overridden.a
Out[7]: 'child'
In [8]: Parent.a.__get__(overridden)
Out[8]: 'parent'
where I've managed to store an extra Python object in overridden
, that just by looking at its __slots__
, I shouldn't have been able to. If I try the first cell with Inherit
, you can see that the extra space isn't there:
In [9]: inherited = Inherit()
...: inherited.a = 'child'
...: Parent.a.__set__(inherited, 'parent')
...: inherited.a
Out[9]: 'parent'
This is what's meant by "the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class)" - Parent.a
is the base class descriptor, and here I've used it to access a previously hidden value in overridden
.
ABC
My timing concerns on ABC are mistaken - I hadn't actually checked, and I had a false model in my head for how ABCMeta
goes about its business. I have no more concerns about using ABC
, but we do know that using @property
has non-trivial speed impacts on fetching attributes. In cases where we just want to define that an attribute exists (a member, rather than a method), we can use __slots__
instead, if we have something like a trait-manager metaclass that correctly builds the type from multiple inherited slots classes.
"TraitManager"
This actually feels to me like something that should be an external library, were we to have it. It may even already exist - I'll have a better look tomorrow, and if not, I'll have a think about writing one. Our use-case is more about static type creation, rather than dynamic trait look-up, though, so it's quite un-Pythonic (but then again, core speed/memory optimisations often are).
Perhaps I'm missing here something but it seems to me that slots and mixin are orthogonal, at least in our discussion here. The use if mixin here is to signify a role (CircuitElement) for certain (albeit many) classes, via an interface commitment. I don't see how slots, being an optimization for handling class attributes more effectively, help us with having interface and functionality commitments for circuit elements. Am I missing here something ?
If run-time efficiency is the main concern for not using mixin and preferring slots and metaclass machinery, I suggest that maybe we do sufficient benchmarking for we can wisely trade off performance with other important factors like readability and extendability.
My concern was all about not letting mixins prevent us from using __slots__
and suppressing the creation of __dict__
in downstream classes. I wanted to have both interface definitions and __slots__
, but I thought that defining attribute interfaces as property
and abstractmethod
would require those to be property
in child classes (to satisfy abc.ABC
), while actually that's not the case. Coming back to it again today, there is no reason you can't use mixins with property
and abstractmethod
to define an interface, and have each child class just manually override all the slots:
In [9]: import abc
...:
...:
...: class Mixin(abc.ABC):
...: __slots__ = ()
...:
...: @property
...: @abc.abstractmethod
...: def a(self):
...: pass
...:
...:
...: class Child(Mixin):
...: __slots__ = ('a',)
...:
...: def __init__(self):
...: self.a = 3
...:
In [10]: Child().a
Out[10]: 3
In [11]: Child().__dict__
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-9c959507972c> in <module>
----> 1 Child().__dict__
AttributeError: 'Child' object has no attribute '__dict__'
There's a minor downside in that the derived classes have to write all their own __slots__
definitions for attributes in the interfaces (like my Child
had to have __slots__ = ('a',)
), but that's not really the end of the world, even if there is a fair amount of duplication. We just have to ensure that all our mixins correctly set __slots__ = ()
.
There are already timing nuisances in some transpiler passes because some of the quantum_info
classes (Operator
in particular) are heavier than they need to be, slower to initialise, and we instantiate millions+ of them. The speed is in part because of how they use mixins with attributes defined in them, without the mixins having __slots__ = ()
, and Operator
doesn't define its slots. That's why I really wanted to make sure we wouldn't be introducing new problems with mixins here - I've absolutely nothing against defining interfaces (that's absolutely the right thing to do), I just wanted to make sure that it wouldn't interfere with speed/memory concerns.
I very much endorse these requirements from @kdk:
- don't require higher-level operations to build separate Instruction sub-classes or to adhere to the Instruction class hierarchy in order to be added to a circuit (Operators, or Multipliers can have their own class structure, or not, independent of that of Instruction),
- facilitate inspection of the high-level operations by the transpiler (as the object itself is contained on the circuit),
- allow for the standardization and documentation of the interface of what's required for a given operation to be stored on a circuit (some of this is currently encoded in the Instruction base class, some is tribal knowledge e.g. how to have a variadic gate type).
But I don't have strong opinions on the use of mixins vs. trait manager.
I think forget about this idea of a trait manager now - it's not necessary, and we can do things efficiently without it. I also like the interfaces method Kevin's describing there.
Is https://github.com/Qiskit/qiskit-terra/issues/4313 superseded by this one?