Add attribute to transformers to check if they should be tested for quality

Open amontanez24 opened this issue 4 years ago • 1 comments

Problem Description

By definition, some transformers don't do anything to help expose or preserve relationships in the data (e.g. the LabelEncodingTransformer). This means they are very likely to fail quality tests.

As of now, the thresholds for quality tests have been lowered to account for this. It would be better to have the tests skip any transformers that are known not to be good for quality.

Proposal

  1. Add an attribute and method to each transformer to determine if it is designed to be good for quality or not.
  2. Raise the thresholds in the tests/quality/test_quality.py file.
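
A rough sketch of what step 1 could look like, to make the proposal concrete. The attribute name `IS_QUALITY_TRANSFORMER`, the method `is_quality_transformer`, the helper `select_transformers_for_quality`, and the class `ExampleQualityTransformer` are all hypothetical names picked for illustration. The classes below are stand-ins and not the actual RDT implementations.

```python
class BaseTransformer:
    """Stand-in for a base transformer class (not the real RDT class)."""

    # Hypothetical flag: True if the transformer is designed to help quality.
    IS_QUALITY_TRANSFORMER = True

    @classmethod
    def is_quality_transformer(cls):
        """Return whether this transformer should be included in quality tests."""
        return cls.IS_QUALITY_TRANSFORMER


class LabelEncodingTransformer(BaseTransformer):
    """Stand-in for a transformer that is not designed to preserve relationships."""

    IS_QUALITY_TRANSFORMER = False


class ExampleQualityTransformer(BaseTransformer):
    """Hypothetical transformer that inherits the default flag (True)."""


def select_transformers_for_quality(transformers):
    """Keep only the transformer classes that opt in to quality testing."""
    return [t for t in transformers if t.is_quality_transformer()]


candidates = [LabelEncodingTransformer, ExampleQualityTransformer]
selected = select_transformers_for_quality(candidates)
print([t.__name__ for t in selected])
```

With a filter like this applied before the quality tests run, the thresholds would only need to hold for transformers that opt in, which is what would make step 2 possible.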

amontanez24, Oct 18 '21 19:10

I agree this should be revisited at some point. I just had to lower the threshold even further to allow the CategoricalTransformer to pass.

fealho, Feb 02 '22 22:02