[RFC] exploration of lazy loading + external service integration and state-driven blueprints

Open sryza opened this issue 6 months ago • 0 comments

Summary & Motivation

This PR is an exploration of hypothetical APIs for loading definitions. It consists of two explorations:

  • A sketch of an integration (Airbyte) that pulls asset definitions from an external service
  • A sketch of a blueprints-loader that pulls blueprints from a state store

It imagines two new concepts:

  • @defs_loader – essentially the same as in https://github.com/dagster-io/dagster/pull/23678: a wrapped function that accepts a DefinitionsLoadContext and returns a Definitions.
  • DefinitionSource – an object that can be included on a Definitions object and that represents a "source" of a set of definitions, like an Airbyte workspace, dbt project, or blueprint-based factory. It has a name and arbitrary metadata. This is useful for a couple of things:
    • When loading definitions in the step worker, the defs loader can use the DefinitionsLoadContext to fetch the DefinitionSource that was returned by the code server during the corresponding "Reload definitions" operation. It can use the metadata on this cached DefinitionSource to reconstruct the definitions, rather than hitting the external service.
    • Metadata on the DefinitionSource, like blueprint schema, can power future HTTP API- and UI-based blueprints functionality.
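
To make the step-worker caching idea concrete, here is a rough sketch in plain Python. It does not import dagster: every class, the decorator, `get_cached_source`, and `fetch_tables_from_service` are hypothetical stand-ins invented for illustration, not the API proposed in this PR or in the linked one. It only shows the intended flow: the code server hits the external service once, and the step worker rebuilds the same definitions from the cached DefinitionSource metadata.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DefinitionSource:
    """Hypothetical stand-in: a named source of definitions plus arbitrary metadata."""
    name: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Definitions:
    """Hypothetical stand-in for Definitions; assets are reduced to plain keys."""
    asset_keys: List[str]
    sources: List[DefinitionSource] = field(default_factory=list)


@dataclass
class DefinitionsLoadContext:
    """Hypothetical stand-in: holds the sources cached at the last reload, if any."""
    cached_sources: Dict[str, DefinitionSource] = field(default_factory=dict)

    def get_cached_source(self, name: str) -> Optional[DefinitionSource]:
        return self.cached_sources.get(name)


def defs_loader(fn):
    """Hypothetical decorator: just tags the function as a definitions loader."""
    fn.is_defs_loader = True
    return fn


FETCH_CALLS = 0


def fetch_tables_from_service() -> List[str]:
    """Pretend call to an external service (e.g. an Airbyte workspace)."""
    global FETCH_CALLS
    FETCH_CALLS += 1
    return ["orders", "customers"]


@defs_loader
def airbyte_defs(context: DefinitionsLoadContext) -> Definitions:
    cached = context.get_cached_source("airbyte_workspace")
    if cached is not None:
        # Step worker path: rebuild from cached metadata, no external call.
        tables = cached.metadata["tables"]
    else:
        # Code server path ("Reload definitions"): hit the external service.
        tables = fetch_tables_from_service()
    source = DefinitionSource("airbyte_workspace", {"tables": list(tables)})
    return Definitions(asset_keys=list(tables), sources=[source])


# Code server: nothing cached yet, so the service is queried once.
code_server_defs = airbyte_defs(DefinitionsLoadContext())

# Step worker: receives the sources produced by the code server.
worker_context = DefinitionsLoadContext(
    cached_sources={s.name: s for s in code_server_defs.sources}
)
worker_defs = airbyte_defs(worker_context)
```

In this sketch the metadata carries everything needed to reconstruct the definitions, which is the property that lets the step worker skip the external call.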

It also imagines the ability for a SensorResult to include a DefinitionsReloadRequest, so that users don't need to use the GraphQL API to automatically reload their definitions.
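
A rough sketch of what that could look like from a sensor's point of view, again in plain Python with stand-in classes. The field name `definitions_reload_request`, the `reason` attribute, and the version-comparison logic are all assumptions made for illustration; the real SensorResult has other fields that are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefinitionsReloadRequest:
    """Hypothetical: asks the system to reload definitions."""
    reason: str = ""


@dataclass
class SensorResult:
    """Hypothetical stand-in for SensorResult, showing only the imagined new field."""
    definitions_reload_request: Optional[DefinitionsReloadRequest] = None


def blueprint_change_sensor(last_seen_version: int, current_version: int) -> SensorResult:
    """Request a reload when the (assumed) state-store version has changed."""
    if current_version != last_seen_version:
        return SensorResult(
            definitions_reload_request=DefinitionsReloadRequest(
                reason=f"state store moved to version {current_version}"
            )
        )
    # Nothing changed: return a result with no reload request.
    return SensorResult()


changed = blueprint_change_sensor(last_seen_version=3, current_version=4)
unchanged = blueprint_change_sensor(last_seen_version=4, current_version=4)
```

The point is that the reload is expressed as part of the sensor's return value, so user code never has to call the GraphQL API itself.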

How I Tested These Changes

sryza • Aug 20 '24 19:08