Is there an equivalent of pandas' DataFrame.corr?
Research
- [X] I have searched the above polars tags on Stack Overflow for similar questions.
- [ ] I have asked my usage related question on Stack Overflow.
Link to question on Stack Overflow
No response
Question about Polars
Pandas has a function for calculating correlation over an entire dataframe called corr(). Does polars have an equivalent for this?
Yes - https://pola-rs.github.io/polars/py-polars/html/reference/expressions/api/polars.pearson_corr.html#polars-pearson-corr
And on the dataframe as well: https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.pearson_corr.html#polars.DataFrame.pearson_corr
@ritchie46 Hi, I think there is a bug here. pearson_corr currently only works when the number of rows equals the number of columns; otherwise it raises an exception like:
import polars as pl
df = pl.DataFrame({"foo": [1, 2, 3, 4], "bar": [3, 2, 1, 4], "ham": [7, 8, 9, 4]})
df.pearson_corr()
File "1.py", line 5, in <module>
df.pearson_corr()
File "/home/appadmin/anaconda3/lib/python3.8/site-packages/polars/internals/dataframe/frame.py", line 7362, in pearson_corr
np.corrcoef(self, **kwargs),
File "<__array_function__ internals>", line 5, in corrcoef
File "/home/appadmin/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2529, in corrcoef
c = cov(x, y, rowvar)
File "<__array_function__ internals>", line 5, in cov
File "/home/appadmin/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2369, in cov
m = np.asarray(m)
File "/home/appadmin/anaconda3/lib/python3.8/site-packages/numpy/core/_asarray.py", line 85, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: cannot copy sequence with size 4 to array axis with dimension 3
The reason is that converting the polars DataFrame to a NumPy array fails. My NumPy version is 1.18.5 and my Polars version is 0.16.5.
import numpy as np
import polars as pl
df = pl.DataFrame({"foo": [1, 2, 3, 4], "bar": [3, 2, 1, 4], "ham": [7, 8, 9, 4]})
np.asarray(df)  # fails with the ValueError shown above