pingouin
Naming of return-values for T-tests etc
Some of the column names in the DataFrame returned by result = pg.ttest(data)
are unfortunate, since they do not comply with the naming conventions for Python variables.
As a result, some values can be read with attribute access, e.g.
result.dof
while those with the non-compliant names can only be accessed with square brackets:
result['p-value']
It would be highly desirable to have names that comply with the Python conventions and requirements for variable names.
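To make the difference concrete, here is a minimal sketch using a hand-built pandas DataFrame as a stand-in for the result table. The column names "dof" and "p-value" are taken from the report above; the numbers are placeholders and not real pg.ttest output.

```python
import pandas as pd

# Hand-built stand-in for a one-row result table.
# The values are placeholders, not real pg.ttest output.
result = pd.DataFrame({"dof": [9], "p-value": [0.03]}, index=["T-test"])

# "dof" is a valid Python identifier, so attribute access works.
dof_series = result.dof

# "p-value" is not a valid identifier: `result.p-value` would be parsed as
# `result.p - value`, so bracket access is required.
p_series = result["p-value"]

# str.isidentifier() reports which column names allow attribute access.
valid = {name: name.isidentifier() for name in result.columns}
```

Here `valid` maps "dof" to True and "p-value" to False, which is exactly the split between the two access styles described above.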
Thanks @thomas-haslwanter, I agree and I've been wanting to fix this in the next release. We simply need to replace all the "-" in variable names with a "_", e.g. "p-unc" -> "p_unc".
I'll implement that in the next release.
Thanks, Raphael
Don't forget to change the 'CI95%' to 'CI95', since the "%"-sign also causes problems.
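The two renaming rules suggested so far (replace "-" with "_", drop "%") could be sketched as below. The helper name to_identifier is hypothetical and is not part of Pingouin; the sample column names are the ones mentioned in this thread.

```python
def to_identifier(name: str) -> str:
    """Hypothetical helper: turn a column label into an attribute-friendly name."""
    # Apply the two rules proposed in this thread: "-" becomes "_", "%" is dropped.
    cleaned = name.replace("-", "_").replace("%", "")
    if not cleaned.isidentifier():
        raise ValueError(f"{name!r} still is not a valid identifier: {cleaned!r}")
    return cleaned


# Column names mentioned in this thread.
old_names = ["dof", "p-value", "p-unc", "CI95%"]
new_names = [to_identifier(name) for name in old_names]
```

As a stopgap on the user side, such a function could presumably be passed to pandas' DataFrame.rename(columns=to_identifier), since rename accepts a callable for its columns argument.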
And one more thing here: values that are a single float should not be returned as a pandas Series object, but simply as a float.
For example, the p-value of the test
result = pg.ttest(before, after)
currently has to be retrieved as
result['p-value']['T-test']
This should be simplified to
result.pval
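Until something like that exists, a scalar can already be pulled out of a one-row DataFrame with standard pandas indexing. The sketch below uses a hand-built table in place of real pg.ttest output (the value is a placeholder); the row label "T-test" and column "p-value" are the ones quoted above.

```python
import pandas as pd

# Hand-built stand-in for the one-row result table; the value is a placeholder.
result = pd.DataFrame({"p-value": [0.03]}, index=["T-test"])

# Chained lookup as quoted above: column first, then row label.
p_chained = result["p-value"]["T-test"]

# Single-step label lookup with .at (row label, column label).
p_at = result.at["T-test", "p-value"]

# Positional lookup, converted to a plain Python float.
p_float = float(result["p-value"].iloc[0])
```

All three lookups return the same number; only the last one is guaranteed to be a built-in float rather than a NumPy scalar.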
Hi @thomas-haslwanter,
This would mean returning the output of most Pingouin functions as a pandas.Series instead of a pandas.DataFrame. While it would be indeed simpler to access the value, I think that the Series output in Jupyter notebook is less easy-to-read than a traditional DataFrame. This is quite a big conceptual modification, so we should discuss that in a separate issue and maybe do a poll.
Thanks, Raphael
Having thought it through, I actually agree with you: using a pd.DataFrame for the result really makes the results MUCH clearer, and should therefore be kept. Either way, thank you for the quick reply!
Edit: this will not be included in the next release of Pingouin (v0.5.1) which is a minor release, but it should be included in the next major release (0.6.0).