Add ability to generate specification

Open philippjfr opened this issue 3 years ago • 3 comments

All components currently implement .from_spec methods which instantiate the component from the declarative specification. We want to be able to do the reverse and construct a specification from component instances by implementing .to_spec methods.

For all the basic component types this should be fairly straightforward, e.g. a Source, View or Filter simply has to serialize its parameters and its type. It becomes a little more difficult when we are dealing with references and variables, because View.pipeline should generally not inline and serialize the entire Pipeline specification.

def to_spec(self, allow_refs=True):
    """
    Converts the component to a declarative specification that can be serialized to YAML.
    Whether sub-component definitions are inlined depends on the type of component,
    e.g. Filter and Transform components will be inlined on a Pipeline but a Pipeline will
    not be inlined on a View.

    Arguments
    -----------
    allow_refs: boolean
      Whether to allow exporting references or to inline the materialized values.

    Returns
    --------
    Declarative specification containing the definition of this component.
    """

Goals

  • We can serialize all component types individually but also a whole Dashboard or Pipeline definition.
  • We can handle references and variables
  • The exported specification faithfully roundtrips to an identical instance, i.e. we can go from instance -> specification -> instance and end up with an identical copy.
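As a rough illustration of the roundtrip goal, here is a minimal sketch using a toy component. `ToySource`, its `uri` and `cache` fields, and the `REGISTRY` lookup are all hypothetical stand-ins for illustration, not Lumen's actual classes or registry mechanism.

```python
from dataclasses import dataclass, fields

# Hypothetical type registry used to resolve the "type" key of a spec.
REGISTRY = {}


@dataclass
class ToySource:
    """Toy component (an assumption for illustration, not a Lumen class)."""

    uri: str
    cache: bool = False

    def to_spec(self):
        # Serialize the component type plus every declared parameter.
        spec = {"type": type(self).__name__}
        spec.update({f.name: getattr(self, f.name) for f in fields(self)})
        return spec

    @classmethod
    def from_spec(cls, spec):
        # Look up the concrete class by its "type" key and pass the
        # remaining keys as constructor parameters.
        params = {k: v for k, v in spec.items() if k != "type"}
        return REGISTRY[spec["type"]](**params)


REGISTRY["ToySource"] = ToySource

source = ToySource(uri="data.csv", cache=True)
spec = source.to_spec()
clone = ToySource.from_spec(spec)

# instance -> specification -> instance yields an equal copy,
# and exporting the copy yields the same specification again.
assert clone == source
assert clone.to_spec() == spec
```

The two assertions at the end are the property the third goal asks for: the copy compares equal to the original, and re-exporting it reproduces the same specification.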

philippjfr avatar Aug 30 '22 09:08 philippjfr

Sounds good! Can you explain "because View.pipeline should generally not inline and serialize the entire Pipeline specification" a bit?

jbednar avatar Aug 30 '22 15:08 jbednar

Sure. The general problem is that what counts as a full specification depends on the context in which you plan to use the exported specification. Say you want to export a Pipeline as a standalone object: in that case you want the specification to include the full definition of the Source. However, when you're exporting a full dashboard you don't want to inline the Source definition in each Pipeline specification, because multiple Pipeline objects may reference that very same Source. Therefore we need to be able to determine when to export a reference and when to inline the full specification.
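One way to picture the distinction is the sketch below, which passes an optional export context down to `to_spec`. The toy classes, the `context` argument, and the `dashboard_to_spec` helper are assumptions made for illustration; they are not taken from the proposal above, which uses an `allow_refs` flag.

```python
class ToySource:
    """Toy source (hypothetical, not a Lumen class)."""

    def __init__(self, name, uri):
        self.name = name
        self.uri = uri

    def to_spec(self):
        return {"type": "toy", "uri": self.uri}


class ToyPipeline:
    """Toy pipeline holding a reference to a possibly shared source."""

    def __init__(self, name, source):
        self.name = name
        self.source = source

    def to_spec(self, context=None):
        if context is None:
            # Standalone export: inline the full Source definition.
            return {"source": self.source.to_spec()}
        # Export within a larger spec: define the Source once in the
        # shared context and refer to it by name.
        context["sources"][self.source.name] = self.source.to_spec()
        return {"source": self.source.name}


def dashboard_to_spec(pipelines):
    # Hypothetical top-level export that collects shared definitions.
    context = {"sources": {}, "pipelines": {}}
    for pipeline in pipelines:
        context["pipelines"][pipeline.name] = pipeline.to_spec(context)
    return context


shared = ToySource("penguins", "penguins.csv")
first = ToyPipeline("first", shared)
second = ToyPipeline("second", shared)

standalone = first.to_spec()
dashboard = dashboard_to_spec([first, second])
```

Here `standalone` carries the inlined Source definition, while `dashboard` defines the Source once under `sources` and both pipelines refer to it by the name `penguins`.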

philippjfr avatar Sep 06 '22 17:09 philippjfr

Looks like it's partially implemented. Should we close this issue?

sophiamyang avatar Sep 26 '22 15:09 sophiamyang

This has for the most part been implemented now so yes, I'll close.

philippjfr avatar Nov 16 '22 13:11 philippjfr

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot] avatar Jul 11 '23 06:07 github-actions[bot]