Partial Correlation gives unexpected output for toy example

Open PascalIversen opened this issue 6 months ago • 4 comments

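Hi, thanks for this great library! For the toy example below, pg.partial_corr returns a perfect partial correlation (r = 1.0) between x and y given z. I expected a correlation of approximately zero, which is what I get from the regression approach (correlating the residuals of x and y after regressing each on z).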

import numpy as np
import pandas as pd
import statsmodels.api as sm
import pingouin as pg
from scipy.stats import pearsonr

n = 10000
y = list(range(1, n+1))
x = y + np.random.normal(size=n)*0.1
z = y 
df = pd.DataFrame({'x': x, 'y': y, 'z': z})
print(pg.partial_corr(data=df, x='x', y='y', covar=['z']))


# Regress x on z and get residuals
X_with_const = sm.add_constant(np.column_stack([z]))  # Add a constant; z is the only covariate
model_X = sm.OLS(x, X_with_const).fit()
residuals_X = model_X.resid

# Regress y on z and get residuals
model_Y = sm.OLS(y, X_with_const).fit()
residuals_Y = model_Y.resid

# Compute correlation of residuals
residual_corr, p = pearsonr(residuals_X, residuals_Y)
print(f'Partial correlation using statsmodels: {residual_corr}, {p}')

Output:

             n    r       CI95%  p-val
pearson  10000  1.0  [1.0, 1.0]    0.0
Partial correlation using statsmodels: 0.0012024407422241278, 0.9043016773480718
 pingouin.__version__
'0.5.4'
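
For reference, here is a minimal sketch of the textbook first-order partial correlation formula applied to the same toy data, using NumPy only. It does not claim to reproduce how pingouin computes its result. The fixed seed and the use of `np.arange` instead of a Python list are assumptions added for reproducibility. Because z is identical to y in this example, r_yz is (numerically) 1, so both the numerator and the denominator of the formula come out close to zero. That suggests the quantity may be ill-conditioned for this input, which could be relevant to the discrepancy.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed, not in the original example
n = 10000
y = np.arange(1, n + 1, dtype=float)
x = y + rng.normal(size=n) * 0.1
z = y.copy()  # covariate identical to y, as in the toy example

# Pairwise Pearson correlations
r_xy = np.corrcoef(x, y)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]

# Textbook formula:
#   r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
numerator = r_xy - r_xz * r_yz
denominator_sq = (1 - r_xz**2) * (1 - r_yz**2)

print(f"r_yz          = {r_yz}")
print(f"numerator     = {numerator:.3e}")
print(f"denominator^2 = {denominator_sq:.3e}")
```

If both printed terms are at the level of floating-point noise, any ratio formed from them is dominated by rounding error rather than by the data.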

Posted by PascalIversen, Aug 22 '24 09:08