Data quality test suite saved as HTML is much bigger than data quality preset metric report (300MB vs. 3MB)

Open billlyzhaoyh opened this issue 9 months ago • 3 comments

The two files in the screenshot are generated with the code below:

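import os

from evidently.metric_preset import DataQualityPreset
from evidently.report import Report
from evidently.test_preset import (
    DataDriftTestPreset,
    DataQualityTestPreset,
    DataStabilityTestPreset,
)
from evidently.test_suite import TestSuite

# df, data_column_mapping and data_profile_dir are defined elsewhere in the script.

# Report: one preset that summarises all columns once (about 3MB as HTML).
print("Generating data quality report...")
data_quality_report = Report(metrics=[
    DataQualityPreset(),
])
data_quality_report.run(reference_data=df, current_data=df, column_mapping=data_column_mapping)
data_quality_report.save_html(
    os.path.join(data_profile_dir, "data_quality.html")
)
print("Data quality report generated successfully!")

# Test suite: three presets combined (about 300MB as HTML).
print("Running data quality test suite...")
data_quality_test_suite = TestSuite(tests=[
    DataDriftTestPreset(),
    DataQualityTestPreset(),
    DataStabilityTestPreset(),
])
data_quality_test_suite.run(reference_data=df, current_data=df, column_mapping=data_column_mapping)
data_quality_test_suite.save_html(
    os.path.join(data_profile_dir, "data_quality_test.html")
)
print("Data quality test suite generated successfully!")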

[Image: Screenshot 2024-05-02 at 17 15 05]

What can I do to shrink the size of the HTML output from the test suite?

billlyzhaoyh avatar May 02 '24 16:05 billlyzhaoyh

Hi @billlyzhaoyh,

In the second instance (when you combine multiple Test Presets), you generate a very large number of column-level tests, compared to the first instance (where DataQualityPreset() generates summaries for all columns only once).

Many of these individual Tests have a visual render (e.g., distribution of each column), increasing the resulting HTML's size.

The solution is to create a custom Test Suite that includes the individual Tests you'd like to see, instead of combining Test Presets. https://docs.evidentlyai.com/user-guide/tests-and-reports/custom-test-suite

elenasamuylova avatar May 02 '24 16:05 elenasamuylova

Thank you for this, @elenasamuylova. I tried to look this up without success: is there any way to disable the visual render functionality in favour of a smaller HTML file?

billlyzhaoyh avatar May 03 '24 10:05 billlyzhaoyh

Hi @billlyzhaoyh, I am afraid there is no such feature currently. However, you can export the results as a JSON or Python dictionary instead: https://docs.evidentlyai.com/user-guide/tests-and-reports/run-tests#output-formats
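
A minimal sketch of that export route follows. It relies only on the as_dict() method of a Test Suite that has already been run. The helper names save_results_as_json and failed_test_names are hypothetical, and the dictionary layout they read (a "tests" list whose entries carry "name" and "status") is an assumption based on the 0.4.x-era output, so check it against your installed version.

```python
import json
from typing import Any


def failed_test_names(results: dict[str, Any]) -> list[str]:
    """Return the names of tests whose status is not SUCCESS.

    Assumes the layout {"tests": [{"name": ..., "status": ...}, ...]}.
    """
    return [
        test.get("name", "<unnamed>")
        for test in results.get("tests", [])
        if test.get("status") != "SUCCESS"
    ]


def save_results_as_json(test_suite: Any, path: str) -> dict[str, Any]:
    """Write the suite's dictionary output to `path` and return it."""
    results = test_suite.as_dict()
    with open(path, "w", encoding="utf-8") as fh:
        # default=str is a fallback for values json cannot serialise natively.
        json.dump(results, fh, indent=2, default=str)
    return results
```

For the suite in the question, this would be called as save_results_as_json(data_quality_test_suite, os.path.join(data_profile_dir, "data_quality_test.json")) after run(...). The output carries the test results without the HTML visualisations.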

elenasamuylova avatar May 03 '24 17:05 elenasamuylova