
Not able to pass the Constraints to addchecks method

Open skathi2629 opened this issue 5 years ago • 2 comments

Scenario: We pass data to ConstraintSuggestionRunner in the Amazon Deequ framework and store the results in a DataFrame, which is then written to an RDBMS (a SQL database in this scenario).

val suggestionResult = {
  ConstraintSuggestionRunner()
    .onData(csv_df)
    .addConstraintRules(Rules.DEFAULT)
    .run()
}

val suggestionDataFrame = suggestionResult.constraintSuggestions
  .flatMap { case (column, suggestions) =>
    suggestions.map { constraint =>
      (column, constraint.description, constraint.codeForConstraint)
    }
  }
  .toSeq
  .toDF("Field_name", "Rule_Description", "DQ_Rules")

suggestionDataFrame.write.mode(SaveMode.Overwrite).jdbc(...)

Problem statement: The results from the auto-suggestion step are written to the RDBMS. We need to retrieve the DQ_Rules field from the database and pass it to the VerificationSuite. Here, DQ_Rules includes custom rules along with the selected auto-suggested constraints. Passing them fails with an error. The code and the error details are below:

val constraints_list = jdbcDF.select("DQ_Rules").collect().mkString("").replace("[","").replace("]","")

val _check = """Check(CheckLevel.Error, "Data validation check")""" +constraints_list

val verificationResult: VerificationResult = {
  VerificationSuite()
    .onData(csv_df)
    .addCheck(_check)
    .run()
}

Error:
Error:(101, 17) type mismatch;
 found   : String
 required: com.amazon.deequ.checks.Check
    .addCheck(_check)

skathi2629 • Jun 15 '20 14:06

This error message indicates that you should pass a Check object instead of a String.
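
A string that only looks like Scala code is never compiled, so concatenating the stored snippets cannot produce a Check. One possible workaround, sketched here and not an official Deequ feature, is to compile the assembled source at runtime with the Scala 2 reflection ToolBox. The sketch assumes that scala-compiler and Deequ are on the classpath, that it runs in a script or spark-shell session, and that the DQ_Rules values are trusted, since evaluating strings from a database executes arbitrary code. The names buildCheck and storedRules are hypothetical. In the issue's code the snippets would come from something like jdbcDF.select("DQ_Rules").collect().map(_.getString(0)).

```scala
import com.amazon.deequ.checks.Check
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

// Hypothetical helper: turns stored snippets such as .isComplete("id")
// into a real Check by compiling the assembled source at runtime.
// WARNING: this executes whatever is stored in the database.
def buildCheck(storedRules: Seq[String]): Check = {
  val source =
    s"""
       |import com.amazon.deequ.checks.{Check, CheckLevel}
       |import com.amazon.deequ.constraints.ConstrainableDataTypes
       |Check(CheckLevel.Error, "Data validation check")${storedRules.mkString}
       |""".stripMargin

  val toolbox = currentMirror.mkToolBox()
  toolbox.eval(toolbox.parse(source)).asInstanceOf[Check]
}

// Stand-in for the values read from the DQ_Rules column (assumed sample data).
val storedRules = Seq(""".isComplete("id")""", """.isUnique("id")""")
val check: Check = buildCheck(storedRules)
```

The resulting value has the type that addCheck expects, so it can be passed as VerificationSuite().onData(csv_df).addCheck(check).run().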

sscdotopen • Jun 27 '20 06:06

Hi all, is there any solution to the above? I too would like to store the constraints in a DB, then "reuse" and "compare" them over time.
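
One way to make stored constraints reusable without evaluating code strings is to persist a structured description of each rule and map it back onto the Check builder methods when loading. The following is a minimal sketch under assumptions: the StoredRule format and the applyRule and toCheck helpers are hypothetical, only three rule names are handled, and the sample rows stand in for what would be read from the database.

```scala
import com.amazon.deequ.checks.{Check, CheckLevel}

// Hypothetical row format: the column a rule applies to plus the rule name.
final case class StoredRule(column: String, rule: String)

// Map each supported rule name onto the corresponding Check builder method.
def applyRule(check: Check, stored: StoredRule): Check = stored.rule match {
  case "isComplete"    => check.isComplete(stored.column)
  case "isUnique"      => check.isUnique(stored.column)
  case "isNonNegative" => check.isNonNegative(stored.column)
  case other =>
    throw new IllegalArgumentException(s"Unsupported rule: $other")
}

// Fold all stored rules into a single Check, starting from an empty one.
def toCheck(rules: Seq[StoredRule]): Check =
  rules.foldLeft(Check(CheckLevel.Error, "Data validation check"))(applyRule)

// Assumed sample data standing in for rows loaded from the database.
val rules = Seq(
  StoredRule("id", "isComplete"),
  StoredRule("id", "isUnique"),
  StoredRule("amount", "isNonNegative")
)
val check: Check = toCheck(rules)
```

Because the rules are plain data in this form, two stored versions can be compared over time with ordinary queries, at the cost of having to extend applyRule for every constraint type that should be supported.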

matt12eagles • Jan 26 '21 16:01