Explore other statistical tests for numerical columns

Open • fealho opened this issue 3 years ago • 1 comment

Currently, SDMetrics provides only the two-sample Kolmogorov-Smirnov (KS) test for comparing numerical columns. We should consider supporting other tests through an optional parameter, so users can choose the test that best matches their understanding of their data.

Additionally, we should explore replacing the KS test with the Anderson-Darling (AD) test as the default, since it is generally considered the more powerful test. Such a change would require experiments showing that the AD test does outperform the KS test in most use cases.
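
To make the idea concrete, here is a minimal sketch of a selectable test. It is not SDMetrics' API: the names `ks_statistic`, `ad_statistic`, `compare_columns`, and the `test` parameter are hypothetical. It uses only the standard library and estimates p-values by permutation so both statistics are treated the same way; a real implementation would more likely call `scipy.stats.ks_2samp` and `scipy.stats.anderson_ksamp`.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ad_statistic(a, b):
    """Two-sample Anderson-Darling statistic (Pettitt's form, assumes no ties)."""
    m, n = len(a), len(b)
    total = m + n
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    from_a = 0  # how many of the i smallest pooled values came from `a`
    acc = 0.0
    for i, (_, label) in enumerate(pooled[:-1], start=1):
        if label == 0:
            from_a += 1
        acc += (from_a * total - m * i) ** 2 / (i * (total - i))
    return acc / (m * n)


# Hypothetical registry: the optional parameter selects one of these.
STATISTICS = {"ks": ks_statistic, "ad": ad_statistic}


def compare_columns(real, synthetic, test="ks", n_permutations=500, seed=0):
    """Return (statistic, permutation p-value) for the chosen test."""
    if test not in STATISTICS:
        raise ValueError(f"Unknown test {test!r}; expected one of {sorted(STATISTICS)}")
    stat_fn = STATISTICS[test]
    observed = stat_fn(real, synthetic)
    rng = random.Random(seed)
    pooled = list(real) + list(synthetic)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if stat_fn(pooled[:len(real)], pooled[len(real):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


# Example: same mean, different spread.
rng = random.Random(42)
real = [rng.gauss(0, 1) for _ in range(100)]
synthetic = [rng.gauss(0, 2) for _ in range(100)]
for name in STATISTICS:
    stat, p_value = compare_columns(real, synthetic, test=name)
    print(name, round(stat, 3), round(p_value, 3))
```

Repeating a comparison like this over many simulated column pairs, and counting how often each test rejects at a fixed significance level, would be one way to run the experiment described above.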

fealho avatar May 19 '22 00:05 fealho

I found these links supporting the notion that the AD test outperforms the KS test:

  • https://www.researchgate.net/publication/276918573_Comparing_distributions_the_two-sample_Anderson-Darling_test_as_an_alternative_to_the_Kolmogorov-Smirnov_test
  • https://asaip.psu.edu/articles/beware-the-kolmogorov-smirnov-test/

mahmadza avatar Mar 28 '23 13:03 mahmadza