python-causality-handbook
[Chapter 08]: Possible mistake in "Weakness of Instruments" example — instrument causality direction
Hi!
I believe there may be a mistake in the example provided in the "Weakness of Instruments" section.
The issue is that, in the code, the generated instrumental variables are caused by the treatment T rather than causing it. This violates a key requirement for a valid instrument: the instrument must affect the treatment, not the other way around.
The relevant part of the code is:
```python
import numpy as np
import pandas as pd

np.random.seed(12)
n = 10000
X = np.random.normal(0, 2, n)  # observable variable
U = np.random.normal(0, 2, n)  # unobservable (omitted) variable
T = np.random.normal(1 + 0.5*U, 5, n)  # treatment
Y = np.random.normal(2 + X - 0.5*U + 2*T, 5, n)  # outcome

stddevs = np.linspace(0.1, 100, 50)
Zs = {f"Z_{z}": np.random.normal(T, s, n) for z, s in enumerate(stddevs)}  # instruments with decreasing Cov(Z, T)
sim_data = pd.DataFrame(dict(U=U, T=T, Y=Y)).assign(**Zs)
```
Here, T depends on U, and Z depends on T. However, for Z to be a valid instrument, we should instead have T depending on Z.
Later in the section, it's mentioned:
"Another thing to notice is that 2SLS is biased! Even with high correlation, the parameter estimate still does not reach the true ATE of 2.0. Actually, 2.0 is not even in the 95% CI! 2SLS is only consistent, which means that it approaches the true parameter value if the sample size is big enough."
But in this case, even if the sample size is greatly increased (e.g., n=10^6), the estimated coefficient does not converge to the true value (2.0). The root problem is neither the correlation strength nor the sample size: the instrument Z is endogenous, because it is a consequence of T. Equivalently, the instrumental variables are correlated with U; in fact, they are caused by U, since they are caused by T, which is itself caused by U.
Thus, the 2SLS estimator converges to a biased value, and the confidence intervals do not contain the true causal effect.
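To see this numerically, here is a minimal sketch that keeps the original data-generating process but uses a single "instrument" Z caused by T (the noise level of 1.0 is illustrative, not from the original code), with a hand-rolled 2SLS via np.polyfit: first stage T on Z, second stage Y on the fitted values.

```python
import numpy as np

np.random.seed(12)
n = 1_000_000  # large sample: sampling noise is negligible, the bias remains
X = np.random.normal(0, 2, n)                    # observable variable
U = np.random.normal(0, 2, n)                    # unobservable (omitted) variable
T = np.random.normal(1 + 0.5*U, 5, n)            # treatment
Y = np.random.normal(2 + X - 0.5*U + 2*T, 5, n)  # outcome, true ATE = 2
Z = np.random.normal(T, 1.0, n)  # "instrument" caused by T, as in the original setup

# Z inherits U through T, so it is endogenous
print(np.corrcoef(Z, U)[0, 1])  # clearly nonzero (about 0.19 here)

# Hand-rolled 2SLS: first stage T ~ Z, second stage Y ~ fitted T
t_hat = np.poly1d(np.polyfit(Z, T, 1))(Z)
ate_2sls = np.polyfit(t_hat, Y, 1)[0]
print(ate_2sls)  # about 1.96, short of the true 2.0, no matter how large n gets
```

With Z being just T plus noise, 2SLS collapses to the omitted-variable-biased OLS of Y on T, so increasing n only tightens the confidence interval around the wrong value.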
To illustrate this, I modified the code slightly:
```python
import numpy as np
import pandas as pd

np.random.seed(12)
n = 1000000
X = np.random.normal(0, 2, n)  # observable variable
U = np.random.normal(0, 2, n)  # unobservable variable
Z = np.random.normal(0, 2, n)  # instrument, generated independently of U
T = np.random.normal(1 + 0.5*U + 0.5*Z, 5, n)  # treatment depends on Z
Y = np.random.normal(2 + X - 0.5*U + 2*T, 5, n)  # outcome

stddevs = np.linspace(0.1, 10, 50)
Zs = {f"Z_{z}": np.random.normal(Z, s, n) for z, s in enumerate(stddevs)}  # noisy copies of Z
sim_data = pd.DataFrame(dict(U=U, T=T, Y=Y)).assign(**Zs)
```
Here, T is a function of Z, as required. Re-running the analysis with this setup (and a large enough sample size, e.g., n=10^5) shows the 2SLS estimator converging to the true coefficient (2.0), with the true value generally inside the confidence interval. You can also check that the instrumental variables are now uncorrelated with U.
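For completeness, the same single-instrument, hand-rolled 2SLS check (first stage T on Z, second stage Y on the fitted values, both via np.polyfit) applied to the corrected setup; the exact figures depend on the seed:

```python
import numpy as np

np.random.seed(12)
n = 1_000_000
X = np.random.normal(0, 2, n)                    # observable variable
U = np.random.normal(0, 2, n)                    # unobservable variable
Z = np.random.normal(0, 2, n)                    # instrument, independent of U
T = np.random.normal(1 + 0.5*U + 0.5*Z, 5, n)    # treatment depends on Z
Y = np.random.normal(2 + X - 0.5*U + 2*T, 5, n)  # outcome, true ATE = 2

# A valid instrument is uncorrelated with the omitted variable
print(np.corrcoef(Z, U)[0, 1])  # close to 0

# Hand-rolled 2SLS: first stage T ~ Z, second stage Y ~ fitted T
t_hat = np.poly1d(np.polyfit(Z, T, 1))(Z)
ate_2sls = np.polyfit(t_hat, Y, 1)[0]
print(ate_2sls)  # close to the true ATE of 2.0
```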
Summary: the original simulation reverses the causal direction required for a valid IV (the instrument Z must cause the treatment T, not the reverse). This explains why the estimated coefficients fail to converge to the true ATE, even with large samples.
Thanks a lot for your work on this great resource — I hope this feedback helps improve it even further! I'll keep reading the rest of the book now 🚀