dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

Add friendlier error messages when packages yml is malformed

Open jared-rimmer opened this issue 3 years ago • 5 comments

resolves #5486

Description

  • Added __post_init__ method to RegistryPackage dataclass that checks that a version has been provided and that the name of the package contains a "/".
  • Added corresponding tests to check the error messages stated in the issue are returned to the user.

Checklist

  • [X] I have read the contributing guide and understand what's expected of me
  • [X] I have signed the CLA
  • [X] I have run this code in development and it appears to resolve the stated issue
  • [X] This PR includes tests, or tests are not required/relevant for this PR
  • [X] I have opened an issue to add/update docs, or docs changes are not required/relevant for this PR
  • [X] I have run changie new to create a changelog entry

jared-rimmer avatar Sep 12 '22 11:09 jared-rimmer

Failures look like connection errors that I can't re-run and resolve

jared-rimmer avatar Sep 12 '22 13:09 jared-rimmer

@gshank would you prefer it to be done using the validate method?

jared-rimmer avatar Sep 20 '22 13:09 jared-rimmer

It would probably be better, because when we add more validation it will probably be done in 'validate' methods. If it takes too much time, we could defer it until later, but give it a try?

gshank avatar Sep 20 '22 13:09 gshank

No worries happy to give it a try! I've had a look at the logic for the validate method and finding it quite hard to understand / find a pattern in the implementation.

Am I correct in thinking that I should add a call to validate in the __post_init__ of the PackageConfig class?

jared-rimmer avatar Sep 21 '22 11:09 jared-rimmer

I had a go at this by adding a validate classmethod to PackageConfig:

@dataclass
class PackageConfig(dbtClassMixin, Replaceable):
    packages: List[PackageSpec]

    @classmethod
    def validate(cls, data):
        super().validate(data)

With a simple test to see what happens when the validate method is called:

    def test_dependency_resolution_allow_prerelease(self):
        
        packages_data = {"packages": [{'package': 'dbt-labs-test/b', 'version': None}]}
        
        a = PackageConfig.validate(data=packages_data)

Which as expected throws a ValidationError:

  @classmethod
  def validate(cls, data: Any):
      h_cls = cast(Hashable, cls)
      schema = _validate_schema(h_cls)
      validator = jsonschema.Draft7Validator(schema)
      error = next(iter(validator.iter_errors(data)), None)
      if error is not None:
>           raise ValidationError.create_from(error) from error
E           hologram.ValidationError: {'package': 'dbt-labs-test/b', 'version': None} is not valid under any of the given schemas
E           
E           Failed validating 'oneOf' in schema['properties']['packages']['items']:
E               {'oneOf': [{'$ref': '#/definitions/LocalPackage'},
E                          {'$ref': '#/definitions/GitPackage'},
E                          {'$ref': '#/definitions/RegistryPackage'}]}
E           
E           On instance['packages'][0]:
E               {'package': 'dbt-labs-test/b', 'version': None}

Not sure if this can be deemed a friendlier error message if a validate method is called when a PackageConfig is passed?

jared-rimmer avatar Sep 22 '22 14:09 jared-rimmer

@jared-rimmer thanks for your patience!

Only adding a validation method with any additional checks continues the same validation it was doing before. We generally add the validation method to add additional validation checks.

    @classmethod
    def validate(cls, data):
        super().validate(data)

If we combine what you had in the post_init in the validation method we get this. We'll want to do super.validate() at the end so we get a chance to catch those validation errors and give nicer messages first. I also tweaked your messages a bit to give some more information.

@dataclass
class PackageConfig(dbtClassMixin, Replaceable):
    packages: List[PackageSpec]

    @classmethod
    def validate(cls, data):
        for package in data["packages"]:
            if not package["version"]:
                raise ValidationError(
                    f"{data['package']} is missing the version.  When installing from the Hub "
                    "package index, version is a required property"
                )

            if "/" not in data["package"]:
                raise ValidationError(
                    f"{data['package']} was not found in the package index. Packages on the index "
                    "require a namespace, e.g dbt-labs/dbt_utils"
                )
        super().validate(data)

Let me know if you have any questions!

emmyoop avatar Oct 06 '22 12:10 emmyoop