scikit-posthocs icon indicating copy to clipboard operation
scikit-posthocs copied to clipboard

Structure for the tests

Open raamana opened this issue 5 years ago • 10 comments

Hi Maksim!

Nice package, thanks for sharing! I am trying to use it and hoping to contribute.

Before I use it, I'd like to understand how it is tested. For example, if you look at https://github.com/maximtrp/scikit-posthocs/blob/4709cf2821ce98dffef542b9b916f7c4a5f00ff4/tests/test_posthocs.py#L27

or any other tests, you seem have to pre-selected the outputs/results to compare.. Where do they come from? From the PMCMR R-package? I tried to look into the PMCMR package to look at their tests - I can't see them in the package distributed, do you know if it is tested?

What are the other principles behind the tests you wrote?

raamana avatar Aug 18 '18 09:08 raamana

Hey! I am using PMCMRplus as a reference package and two datasets (seaborn's exercise, a custom one for block design data, and one more for Tukey HSD) that are initialized in lines #11, #12, and #139. As can be seen in the code, pulse and kind columns of exercise dataset are used as a value and group column, respectively. Block dataset is converted to a DataFrame (rows are blocks, and cols are treatments or groups). The results, you have noticed to be pre-defined, were obtained with PMCMRplus.

You are welcome to make pull requests!

maximtrp avatar Aug 18 '18 10:08 maximtrp

Thanks - I tried to check into the source of PMCMRplus and PMCMR, and they do not seem to contain any tests at all. Testing against non-tested package doesn't give me comfort.

I am trying to think of developing some ground truth tests.. Let me know if you have thought about it already, and have some ideas.

raamana avatar Aug 18 '18 10:08 raamana

PMCMRplus contains references to scientific articles as well as sample datasets from literature (such as Sachs, 1997). It would be nice to move scikit-posthocs tests to these data. I do not have much time to do it now, though. If you can help, it would be great.

maximtrp avatar Aug 18 '18 10:08 maximtrp

I see.. Happy to help, but its not clear if the R package is tested thoroughly. will keep you posted.

raamana avatar Aug 18 '18 10:08 raamana

We can look through the articles and take the results provided by researchers as references for tests. It is a huge work, though

maximtrp avatar Aug 18 '18 10:08 maximtrp

I know.. I guess we can start with some sanity checks, and work our way towards more complicated tests. I will open another PR soon.

raamana avatar Aug 18 '18 11:08 raamana

Thank you for your contribution

maximtrp avatar Aug 18 '18 11:08 maximtrp

Take a look at some super-basic likely-wrong initial attempt: https://github.com/raamana/scikit-posthocs/blob/cb06d223b88a8d4d8fec949498a89f9141131009/tests/test_properties_posthocs.py#L50

raamana avatar Aug 18 '18 18:08 raamana

Thank you. Actually, the problem we should deal with first is correctness of implementation. Your code tries to check a test by playing with type I errors (false positives), which are related to statistical value of a test, but not its implementation quality. Though, this is not absolutely wrong, and it may be kept as a fallback unit test if no valid examples of a post-hoc test usage are found. Also, we should use an acceptable and established way to create random samples.

Here are some steps I have summarized:

  1. Check if a test is implemented correctly (we have to find a validated example, probably provided by authors, or verify code by comparing it to the mathematics given in a scientific source).
  2. Check code performance and optimize it.

maximtrp avatar Aug 20 '18 03:08 maximtrp

Hi! I agree with you on the motivation. That’s definitely the long term plan - what I provided is a first pass of a rough draft attempting to get there.. at the end, we must have multiple of types of tests, both unit and integration tests (against various expected behaviours), as well as for specific properties. what you propose takes lot of effort, and I’d be happy to contribute when I can.

Speed isn’t a concern for me typically, unless it’s too slow to be a dealbreaker. I try to get the implementation correctly first, and then if the speed is not satisfactory, we can certainly try to profile the code and speed it up. I wouldn't worry about it all unless we are forced to.

raamana avatar Aug 24 '18 11:08 raamana