hamilton Utilize the output of another node in data quality

Is your feature request related to a problem? Please describe. Say you want to ensure your input data is the same shape as your output data. This should be easy enough, but is not currently feasible. You could work around it if you want by joining them into a single tuple and writing a custom check to see if the two items match, but that's awkward and requires the joined one to be on the blocking path.

Describe the solution you'd like TBD exactly -- will write out more later, but I think we can use source and value, defaulting to value.

@check_output(index_matches=source('input_data'))
def output_data() -> pd.DataFrame:
    # ...

Or...

@check_output.custom(CustomIndexMatcher(index_matches=source('input_data')))
def output_data() -> pd.DataFrame:
    # ...

Describe alternatives you've considered See above, but nothing that clear

Additional context In a talk with an OS user.

Jun 01 '23 14:06 elijahbenizzy

Two resources:

Pydantic passing context info to validators: https://docs.pydantic.dev/latest/usage/validators/#validation-context
Deal ensure decorator receives function inputs/outputs: https://deal.readthedocs.io/basic/values.html#deal-ensure

Jul 03 '23 13:07 zilto

Changes one might need to make:

1. Change this to add any non-static dependencies.
2. Loosen the validation here -- instead, wire through the type as the validator type. We can take the applies_to and make the parameter in (1) a union of the applicable types.
3. Change that to return validators in a delayed manner (e.g. a function that builds them given the constructor arguments).

Basically we need to push this further downstream -- we only need to know (a) which validators to build and (b) which parameters they take in to build the DAG, then we can construct them at runtime. So, probably a complex change but not too many lines of code.
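A rough, library-independent sketch of that idea, to make the "delayed" part concrete. Everything here is hypothetical and not Hamilton's actual API: `Source`, `DelayedValidator`, and `IndexMatchesValidator` are made-up names, and the index is modelled as a plain list rather than a pandas index.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Source:
    """Hypothetical marker: take this argument from another node's output."""
    node_name: str


@dataclass
class IndexMatchesValidator:
    """Toy validator: passes when the output index equals the reference index."""
    index_matches: List[Any]

    def validate(self, output_index: List[Any]) -> bool:
        return list(output_index) == list(self.index_matches)


@dataclass
class DelayedValidator:
    """Records which validator to build and with which arguments,
    but only constructs it once upstream values are available."""
    factory: Callable[..., Any]
    kwargs: Dict[str, Any]

    def required_nodes(self) -> List[str]:
        # All that is needed at DAG-construction time: the upstream node names.
        return [v.node_name for v in self.kwargs.values() if isinstance(v, Source)]

    def build(self, node_values: Dict[str, Any]) -> Any:
        # At runtime, swap Source markers for real values; static values pass through.
        resolved = {
            key: node_values[val.node_name] if isinstance(val, Source) else val
            for key, val in self.kwargs.items()
        }
        return self.factory(**resolved)


delayed = DelayedValidator(IndexMatchesValidator, {"index_matches": Source("input_data")})
deps = delayed.required_nodes()                         # known when building the DAG
validator = delayed.build({"input_data": [0, 1, 2]})    # constructed at runtime
ok = validator.validate([0, 1, 2])
bad = validator.validate([0, 1])
```

The split between `required_nodes()` and `build()` mirrors (b) and the runtime construction step: the graph only needs the dependency names up front, and the validator itself can be created later.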

Sep 27 '23 17:09 elijahbenizzy