
Allow registry of data validators for arguments

Open elijahbenizzy opened this issue 2 years ago • 1 comment

Is your feature request related to a problem? Please describe.

Currently you have to use check_output.custom. However, it should be as simple as adding to this list: https://github.com/DAGWorks-Inc/hamilton/blob/b207db71a79c12413b75f981277d813a82f1c89d/hamilton/data_quality/default_validators.py#L399.

We should handle data validators the same way we handle data adapters, and change the registry so that validators can be registered too.

Describe the solution you'd like

from hamilton.data_quality.base import BaseDefaultDataValidator
from hamilton.registry import register_validator  # proposed API, does not exist yet

class MyCustomValidator(BaseDefaultDataValidator):
    ...

register_validator(MyCustomValidator)
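To make the proposal a bit more concrete, here is a minimal, self-contained sketch of what such a hook might do. Everything in it is an assumption: `register_validator` is the function proposed above and not an existing Hamilton API, and the base class and list are local stand-ins for `BaseDefaultDataValidator` and the validator list linked earlier, so the snippet runs without Hamilton installed.

```python
from typing import List, Type


class BaseDefaultDataValidator:
    """Stand-in for hamilton.data_quality.base.BaseDefaultDataValidator."""

    @classmethod
    def arg(cls) -> str:
        raise NotImplementedError


# Stand-in for the list of default validators linked in the issue.
AVAILABLE_DEFAULT_VALIDATORS: List[Type[BaseDefaultDataValidator]] = []


def register_validator(validator_cls: Type[BaseDefaultDataValidator]) -> None:
    """Hypothetical hook: check the class, then add it to the list once."""
    is_validator = isinstance(validator_cls, type) and issubclass(
        validator_cls, BaseDefaultDataValidator
    )
    if not is_validator:
        raise TypeError(f"{validator_cls!r} is not a BaseDefaultDataValidator subclass")
    if validator_cls not in AVAILABLE_DEFAULT_VALIDATORS:
        AVAILABLE_DEFAULT_VALIDATORS.append(validator_cls)


class MyCustomValidator(BaseDefaultDataValidator):
    @classmethod
    def arg(cls) -> str:
        return "schema"


register_validator(MyCustomValidator)
register_validator(MyCustomValidator)  # registering twice is a no-op
```

The type check and the duplicate guard are the main things a dedicated function could offer over appending to the list directly.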


elijahbenizzy, Nov 29 '23 19:11

See if this approach to adding to the default validator list works.

# hamilton/plugins/my_validator.py

from hamilton.data_quality import base, default_validators

# Subclasses BaseDefaultDataValidator, to match the class named in the issue
class CustomValidator(base.BaseDefaultDataValidator):
    def __init__(self, schema: ..., importance: str):
        super().__init__(importance)
        self.schema = schema

    @classmethod
    def arg(cls) -> str:
        # Keyword that selects this validator in @check_output(...)
        return "schema"

    # ...

def register_validators():
    # Add the custom class to the list of default validators
    default_validators.AVAILABLE_DEFAULT_VALIDATORS.append(CustomValidator)

register_validators()

Then the validator is available through @check_output(schema=...).

However, I don't know how the validators are registered or in what order they are loaded.
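A toy model of why loading order might matter, under the assumption that keyword arguments are matched against the list at the moment validators are resolved. I have not confirmed that this is how Hamilton does it, and all names here (`Validator`, `AVAILABLE`, `resolve`) are made up for the illustration.

```python
from typing import Any, Dict, List, Type


class Validator:
    """Stand-in for a default validator; not Hamilton's real base class."""

    def __init__(self, value: Any, importance: str):
        self.value = value
        self.importance = importance

    @classmethod
    def arg(cls) -> str:
        raise NotImplementedError


class RangeValidator(Validator):
    @classmethod
    def arg(cls) -> str:
        return "range"


class SchemaValidator(Validator):
    @classmethod
    def arg(cls) -> str:
        return "schema"


# Only the built-in validator is registered at first.
AVAILABLE: List[Type[Validator]] = [RangeValidator]


def resolve(importance: str = "warn", **kwargs: Any) -> List[Validator]:
    """Match each keyword against the validators registered at call time."""
    by_arg: Dict[str, Type[Validator]] = {v.arg(): v for v in AVAILABLE}
    unknown = set(kwargs) - set(by_arg)
    if unknown:
        raise ValueError(f"No registered validator for: {sorted(unknown)}")
    return [by_arg[key](value, importance) for key, value in kwargs.items()]


# Resolving before the plugin has registered itself fails...
try:
    resolve(schema={"a": int})
    resolved_before = True
except ValueError:
    resolved_before = False

# ...and succeeds once the plugin module has appended its class.
AVAILABLE.append(SchemaValidator)
resolved_after = resolve(schema={"a": int})
```

If resolution works like this, the plugin module has to be imported before any function decorated with @check_output(schema=...) gets resolved, which is why the loading order would need to be pinned down.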

Notes

  • we could then move the pandera validators from hamilton/data_quality/pandera_validators.py to hamilton/plugins/pandera_extensions.py

zilto, Mar 04 '24 15:03