hyppo
CCA Incorrect Statistics
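When d = 1, the canonical correlation from CCA equals the absolute value of Pearson's correlation coefficient. hyppo's CCA does not return this statistic: in the example below it returns about 0.00244, which is the square of the expected 0.0494. Similar discrepancies appear in higher dimensions. I believe the cause is the squaring of the singular values.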
Reproducing code example:
import numpy as np
from sklearn.cross_decomposition import CCA as CCA_sklearn
from hyppo.independence import CCA as CCA_hyppo
x = np.random.normal(size=(100, 1))
y = np.random.normal(size=(100, 1))
np.abs(np.corrcoef(x.T, y.T)[0, 1])
>>>> 0.04939641702196
cca_sklearn = CCA_sklearn(1)
np.corrcoef(*cca_sklearn.fit_transform(x, y), rowvar=False).diagonal(offset=1)[0]
>>>> 0.04939641702196002
cca_hyppo = CCA_hyppo()
cca_hyppo.statistic(x, y)
>>>> 0.0024400060146073806
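As a cross-check that does not depend on either library, here is a minimal NumPy-only sketch of the canonical correlations, assuming the standard QR-plus-SVD formulation and full column rank in both inputs. The helper name `canonical_correlations` is hypothetical and this is not hyppo's or scikit-learn's implementation; it only illustrates that the unsquared singular value matches the absolute Pearson correlation when d = 1.

```python
import numpy as np


def canonical_correlations(x, y):
    """Hypothetical helper: canonical correlations of x (n, p) and y (n, q).

    Assumes both inputs have full column rank.
    """
    # Center each column so inner products correspond to covariances.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    # The singular values of qx.T @ qy are the canonical correlations.
    return np.linalg.svd(qx.T @ qy, compute_uv=False)


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = rng.normal(size=(100, 1))

pearson = np.corrcoef(x.T, y.T)[0, 1]
rho = canonical_correlations(x, y)[0]

# For d = 1 the single canonical correlation should equal |Pearson r|.
print(np.isclose(rho, abs(pearson)))
# Squaring the singular value gives r**2 instead, which is strictly smaller.
print(np.isclose(rho**2, pearson**2))
```

If the value returned by `cca_hyppo.statistic(x, y)` agrees with `rho**2` rather than `rho` on the same data, that would be consistent with the squaring hypothesis above.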