
Naming of return-values for T-tests etc

Open thomas-haslwanter opened this issue 2 years ago • 6 comments

Some of the column names in the DataFrame returned by result = pg.ttest(data) are unfortunately chosen, since they do not comply with Python's naming rules for variables. As a result, some values can be read with attribute access, e.g. result.dof, while those with non-compliant names can only be accessed with square brackets: result['p-value'].

It would be highly desirable to have names that comply with the Python conventions and requirements for variable names.

thomas-haslwanter avatar Nov 02 '21 14:11 thomas-haslwanter

Thanks @thomas-haslwanter, I agree and I've been wanting to fix this in the next release. We simply need to replace all the "-" in variable names with a "_", e.g. "p-unc" -> "p_unc".
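
To illustrate the proposed rename, here is a minimal sketch using only the standard library. The helper name `to_identifier` is hypothetical and not part of Pingouin, and the list contains only column names mentioned in this thread, not everything Pingouin returns.

```python
# Column names mentioned in this thread (not the full set pingouin returns).
columns = ["dof", "p-value", "p-unc"]


def to_identifier(name: str) -> str:
    """Hypothetical helper: replace hyphens with underscores."""
    return name.replace("-", "_")


for old in columns:
    new = to_identifier(old)
    # str.isidentifier() reports whether a string is a valid Python name,
    # which is what attribute-style access (result.dof) relies on.
    print(f"{old!r} valid={old.isidentifier()} -> {new!r} valid={new.isidentifier()}")
```

With pandas, a callable like this could presumably be applied to an existing result via `df.rename(columns=to_identifier)`.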

I'll implement that in the next release.

Thanks, Raphael

raphaelvallat avatar Nov 02 '21 15:11 raphaelvallat

Don't forget to change the 'CI95%' to 'CI95', since the "%"-sign also causes problems.

thomas-haslwanter avatar Nov 02 '21 20:11 thomas-haslwanter

And one more thing here: values that are a single float should not be returned as a pandas Series object, but simply as a float. For example, the p-value of the test result = pg.ttest(before, after) currently has to be retrieved as result['p-value']['T-test']. This should be simplified to result.pval.
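
For reference, a scalar can be pulled out of a one-row DataFrame in a few ways with plain pandas. This is a minimal sketch on a hand-built stand-in table: the numbers are invented for illustration and are not real Pingouin output, and the column and row labels are the ones quoted in this comment.

```python
import pandas as pd

# Hand-built stand-in for a one-row result table; values are made up
# for illustration and do not come from pg.ttest.
result = pd.DataFrame({"dof": [9], "p-value": [0.03]}, index=["T-test"])

# Chained lookup as described above: column first, then row label.
p_chained = result["p-value"]["T-test"]

# Single label-based scalar lookup: (row label, column label).
p_at = result.at["T-test", "p-value"]

# .item() returns the only element of a length-1 Series as a Python scalar.
p_item = result["p-value"].item()

# Attribute access works only because "dof" is a valid identifier.
dof = result.dof["T-test"]

print(p_chained, p_at, p_item, dof)
```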

thomas-haslwanter avatar Nov 10 '21 11:11 thomas-haslwanter

Hi @thomas-haslwanter,

This would mean returning the output of most Pingouin functions as a pandas.Series instead of a pandas.DataFrame. While it would indeed be simpler to access the value, I think that Series output in a Jupyter notebook is harder to read than a traditional DataFrame. This is quite a big conceptual change, so we should discuss it in a separate issue and maybe run a poll.

Thanks, Raphael

raphaelvallat avatar Nov 11 '21 00:11 raphaelvallat

Having thought it through, I actually agree with you: using a pd.DataFrame for the result really makes the results MUCH clearer, and it should therefore be kept. Either way, thank you for the quick reply!

thomas-haslwanter avatar Nov 11 '21 09:11 thomas-haslwanter

Edit: this will not be included in the next release of Pingouin (v0.5.1) which is a minor release, but it should be included in the next major release (0.6.0).

raphaelvallat avatar Feb 12 '22 18:02 raphaelvallat