Add ability to generate specification
All components currently implement .from_spec methods which instantiate the component from the declarative specification. We want to be able to do the reverse and construct a specification from component instances by implementing .to_spec methods.
For all the basic component types this should be fairly straightforward, e.g. a Source, View or Filter simply has to serialize its parameters and its type. It becomes a little more difficult when references and variables are involved, because View.pipeline should generally not inline and serialize the entire Pipeline specification.
    def to_spec(self, allow_refs=True):
        """
        Converts the component to a declarative specification that can be serialized to YAML.
        Whether sub-component definitions are inlined depends on the type of component,
        e.g. Filter and Transform components will be inlined on a Pipeline but a Pipeline will
        not be inlined on a View.

        Arguments
        ---------
        allow_refs: boolean
            Whether to allow exporting references or to inline the materialized values.

        Returns
        -------
        Declarative specification containing the definition of this component.
        """
Goals
- We can serialize all component types individually but also a whole Dashboard or Pipeline definition.
- We can handle references and variables.
- The exported specification faithfully roundtrips to an identical instance, i.e. we can go from instance -> specification -> instance and end up with an identical copy.
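For a basic component, the roundtrip goal could look roughly like the sketch below. It is not the project's actual implementation: `REGISTRY`, `register`, `Component`, `FileSource` and the `component_type` attribute are hypothetical names, used only to show a component serializing its type and parameters and then being rebuilt from them.

```python
from dataclasses import dataclass, fields

# Hypothetical registry mapping a type name to a component class.
REGISTRY = {}


def register(cls):
    """Record the class under its (assumed) component_type name."""
    REGISTRY[cls.component_type] = cls
    return cls


@dataclass
class Component:
    def to_spec(self):
        # Serialize the type name plus every declared parameter.
        spec = {"type": self.component_type}
        for field in fields(self):
            spec[field.name] = getattr(self, field.name)
        return spec

    @classmethod
    def from_spec(cls, spec):
        # Look up the concrete class by type and pass the rest as parameters.
        params = dict(spec)
        component_cls = REGISTRY[params.pop("type")]
        return component_cls(**params)


@register
@dataclass
class FileSource(Component):
    component_type = "file"  # plain class attribute, not a dataclass field
    path: str = ""
    cache: bool = False


source = FileSource(path="data.csv", cache=True)
spec = source.to_spec()
restored = Component.from_spec(spec)
```

Here `restored == source` holds because dataclasses compare field by field, which is one simple way to check the instance -> specification -> instance roundtrip in a test.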
Sounds good! Can you explain "because View.pipeline should generally not inline and serialize the entire Pipeline specification" a bit?
Sure, the problem in general is that what counts as a full specification depends on the context in which you plan to use the export. Say you want to export a Pipeline as a standalone thing; in that case you want the specification to include the full definition of the Source. However, when you're exporting a full dashboard you don't want to inline the Source definition in the Pipeline specification, because multiple Pipeline objects may reference that very same Source. Therefore we need to be able to determine when to export a reference and when to inline the full specification.
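A minimal sketch of that distinction, under some loud assumptions: the `Source` and `Pipeline` classes here are simplified stand-ins, `dashboard_to_spec` is a hypothetical helper, and using `allow_refs` to choose between a name reference and an inlined Source is one possible reading of the flag rather than confirmed behavior.

```python
class Source:
    def __init__(self, name, uri):
        self.name = name
        self.uri = uri

    def to_spec(self):
        return {"type": "source", "uri": self.uri}


class Pipeline:
    def __init__(self, name, source, table):
        self.name = name
        self.source = source
        self.table = table

    def to_spec(self, allow_refs=True):
        if allow_refs:
            # Assumed convention: refer to the shared Source by name.
            source_spec = self.source.name
        else:
            # Standalone export: inline the full Source definition.
            source_spec = self.source.to_spec()
        return {"source": source_spec, "table": self.table}


def dashboard_to_spec(pipelines):
    """Hypothetical helper: define each Source once, reference it from pipelines."""
    sources = {}
    for pipeline in pipelines:
        sources[pipeline.source.name] = pipeline.source.to_spec()
    return {
        "sources": sources,
        "pipelines": {p.name: p.to_spec(allow_refs=True) for p in pipelines},
    }


db = Source("db", "sqlite:///data.db")
sales = Pipeline("sales", db, "sales")
users = Pipeline("users", db, "users")

standalone = sales.to_spec(allow_refs=False)    # Source inlined
dashboard = dashboard_to_spec([sales, users])   # Source defined once, referenced twice
```

The standalone export is self-contained, while the dashboard export keeps a single Source definition that both pipelines point at, so rebuilding it can share one Source instance instead of creating two copies.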
Looks like it's partially implemented. Should we close this issue?
This has for the most part been implemented now so yes, I'll close.