ivreghdfe
[BUG] Incorrect fixed effects with `cluster()`
clear
sysuse auto, clear
ivreghdfe price (mpg = turn), absorb(a1=rep78) cluster(rep78)
ivreghdfe price (mpg = turn), absorb(a2=rep78) cluster(rep78)
gen diff = reldif(a1, a2)
which ivreghdfe
sum diff, d
gives
/home/mauricio/ado/plus/i/ivreghdfe.ado
*! ivreghdfe 1.1.3 04Jan2023 (bugfix for github issue #48)
*! ivreghdfe 1.1.2 29Sep2022 (bugfix for github issue #44)
*! ivreghdfe 1.1.1 14Dec2021 (experimental -margins- support)
*! ivreghdfe 1.1.0 25Feb2021
*! ivreg2 4.1.11 22Nov2019
*! authors cfb & mes
*! see end of file for version comments
                            diff
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%     .0145922              0       Obs                  74
25%     .0145922              0       Sum of Wgt.          74

50%     .0188667                      Mean           .0251712
                        Largest       Std. Dev.      .0174165
75%     .0292981       .0621538
90%     .0621538       .0621538       Variance       .0003033
95%     .0621538       .0621538       Skewness       1.148167
99%     .0621538       .0621538       Kurtosis       3.469954
FYI it's different each run, and I'm not sure what's causing the sort order of the saved FEs to change. This also causes `xbd` and `resid` to change, if they are used.
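One way to check whether the saved fixed effects are landing on the wrong rows is to test that each one is constant within levels of `rep78`, as an absorbed group effect should be. This is only a rough sketch, assuming `a1` and `a2` from the run above are still in memory; the `a1_sd` and `a2_sd` variable names are made up for illustration and are not created by `ivreghdfe`.

```stata
* Hypothetical diagnostic, not part of ivreghdfe: a group fixed effect
* should take a single value within each level of the absorbed variable.
foreach v of varlist a1 a2 {
    * Within-group standard deviation of the saved FE (in-sample rows only)
    bysort rep78: egen double `v'_sd = sd(`v') if !missing(`v')
}
* Values clearly above rounding error would suggest misassigned rows
summarize a1_sd a2_sd
```

Note that `bysort` re-sorts the data by `rep78`, so reload the dataset with `sysuse auto, clear` before trying to reproduce the problem again.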
@sergiocorreia Actually they're just wrong, sorry for the multiple messages:
clear
sysuse auto, clear
ivreghdfe price (mpg = turn), absorb(a1=rep78) cluster(rep78)
ivreghdfe price (mpg = turn), absorb(a2=rep78) cluster(rep78)
reghdfe mpg turn, absorb(rep78) resid
predict mpghat, xbd
reghdfe price mpghat, absorb(a3=rep78) cluster(rep78)
gen diff1 = reldif(a1, a3)
gen diff2 = reldif(a2, a3)
sum diff?
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       diff1 |         74    93.00652    119.3723          0   300.4831
       diff2 |         74    89.67282    112.5629          0   285.0808
For some reason, the problem is solved when pre-sorting by the fixed effects before the run:
sysuse auto, clear
sort rep78
ivreghdfe price turn, absorb(a1=rep78) cluster(rep78)
ivreghdfe price turn, absorb(a2=rep78) cluster(rep78)
reghdfe price turn, absorb(a3=rep78) cluster(rep78)
replace a3 = a3 + _b[_cons]
gen diff1 = reldif(a1, a3)
gen diff2 = reldif(a2, a3)
sum diff?
But I need to dig a bit more into why the sorting gets broken.
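Until the cause is found, the pre-sort can be wrapped so that the dataset ends up back in its original row order. The following is a sketch of that idea rather than a confirmed fix: it reuses the specification from the example above (no instrumented variable), it has not been checked against the `(mpg = turn)` specification, and `orig_order` is a hypothetical helper variable.

```stata
sysuse auto, clear
* Remember the original row order (orig_order is a made-up name)
gen long orig_order = _n
* Pre-sort by the absorbed variable; orig_order breaks ties deterministically
sort rep78 orig_order
ivreghdfe price turn, absorb(a1=rep78) cluster(rep78)
* Restore the original order; each saved a1 value stays with its row
sort orig_order
```

Sorting on `rep78 orig_order` rather than `rep78` alone keeps the within-group order reproducible across runs, which makes before/after comparisons of the saved fixed effects easier.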