Openlineage Adapter

Open HamiltonRepoMigrationBot opened this issue 2 years ago • 3 comments

Issue by skrawcz Tuesday Jun 21, 2022 at 21:57 GMT Originally opened as https://github.com/stitchfix/hamilton/issues/137


Is your feature request related to a problem? Please describe. Hamilton encodes a lot of metadata that lives in code, and it creates more at execution time. Projects such as https://datahubproject.io/ and https://openlineage.io/ capture this metadata across a wide array of tooling to create a central view in a heterogeneous environment. Hamilton should be able to emit metadata and execution information to them.

Describe the solution you'd like: A user should be able to specify whether their Hamilton DAG should emit metadata. This should play nicely with graph adapters, e.g. Spark, Ray, Dask.

This should use the post-graph-execution hook to emit OpenLineage information.

Use case:

  • Creating a dataset and writing it somewhere. The adapter can then emit OpenLineage information about what was executed.
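To make the request concrete, here is a minimal sketch of a hook that runs after graph execution and emits an OpenLineage-style run event. It deliberately imports neither Hamilton nor the OpenLineage client: `LineageEmittingHook`, `post_graph_execute`, `build_run_event`, the `hamilton-example` namespace, and the producer URL are all hypothetical names chosen for illustration, not existing APIs. Only the event field names (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`) follow the OpenLineage run event layout.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


def build_run_event(
    job_name: str,
    output_names: List[str],
    success: bool,
    namespace: str = "hamilton-example",  # hypothetical namespace
) -> Dict[str, Any]:
    """Build a dict shaped like an OpenLineage run event (illustrative only)."""
    return {
        "eventType": "COMPLETE" if success else "FAIL",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [],
        "outputs": [{"namespace": namespace, "name": n} for n in output_names],
        # Placeholder producer URI; a real adapter would identify itself here.
        "producer": "https://example.com/hypothetical-hamilton-adapter",
    }


class LineageEmittingHook:
    """Hypothetical stand-in for a post-graph-execution hook."""

    def __init__(self, emit: Callable[[Dict[str, Any]], None], job_name: str):
        self.emit = emit  # any callable that ships the event somewhere
        self.job_name = job_name

    def post_graph_execute(self, results: Dict[str, Any], success: bool) -> None:
        # Treat every requested output of the DAG run as an output dataset.
        event = build_run_event(self.job_name, sorted(results), success)
        self.emit(event)


# Usage: collect events in a list instead of sending them to a backend.
collected: List[Dict[str, Any]] = []
hook = LineageEmittingHook(emit=collected.append, job_name="build_dataset")
hook.post_graph_execute(results={"cleaned_table": [1, 2, 3]}, success=True)
```

Passing `emit` as a plain callable keeps the hook independent of the execution backend, which is one way such an adapter could stay compatible with Spark, Ray, or Dask graph adapters.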

Comment by skrawcz Wednesday Jun 22, 2022 at 03:53 GMT


~Adding a custom source for Datahub:~

  • https://datahubproject.io/docs/metadata-ingestion/developing/
  • https://datahubproject.io/docs/metadata-ingestion/adding-source/

Comment by skrawcz Tuesday Dec 27, 2022 at 21:16 GMT


FYI - @gravesee - seems like I created this a while back for emission of metadata/usage.

Would love any help in providing a motivating use case!

Updated this issue to be about OpenLineage.

Jul 18 '24 18:07 skrawcz

This is done

Sep 17 '24 22:09 elijahbenizzy