reghdfe
issue with if statements, factor variables, negative values
This is a minor bug, but because it contrasts with usage for -regress- I wanted to flag it. Stata doesn't allow factor variables with negative values, but it does accept them when an -if- condition excludes the negative values. reghdfe, however, throws an error in this case.
Consider the following MWE:
clear all
set obs 10000
gen x = runiform()
gen z = ceil(runiform()*11-6)
gen g = ceil(runiform()*10)
gen y = g + z + x + rnormal()
* this works fine
areg y x i.z if z>=0, absorb(g)
* this throws an error about negative values for z
reghdfe y x i.z if z>=0, absorb(g)
As a workaround, users can add z to the absorb() option, optionally saving the FEs to variables, and that works fine. Alternatively, add a constant to z so that it is always non-negative and adjust the if condition accordingly.
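Both workarounds can be sketched against the MWE above. This is a minimal sketch, not a tested recipe: `z_shift` and `z_fe` are hypothetical variable names introduced here, the offset of 5 relies on z ranging from -5 to 5 as generated in the MWE, and the `newvar=absvar` form for saving fixed effects is assumed to be available in the installed reghdfe version.

```stata
* Workaround 1: shift z so that every level is non-negative.
* z_shift is a hypothetical helper variable; z takes values -5..5 in the MWE,
* so adding 5 maps it to 0..10.
gen z_shift = z + 5
* The original condition z>=0 becomes z_shift>=5 after the shift.
reghdfe y x i.z_shift if z_shift>=5, absorb(g)

* Workaround 2: absorb z instead of entering it as a factor variable.
* z_fe is a hypothetical name for the saved fixed-effect estimates
* (assumes the newvar=absvar syntax is supported by your reghdfe version).
reghdfe y x if z>=0, absorb(g z_fe=z)
```

With the first approach the factor levels are only relabeled (level k of z appears as level k+5 of z_shift), so the coefficients should be read with that offset in mind.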
I looked into it but it seems the problem needs to be solved from Stata's side:
sysuse auto, clear
replace rep = rep - 3
gen byte touse = rep >= 0
mata: st_data(., "i.rep", "touse")
Even if the touse variable restricts the sample to the non-negative range of rep, Mata refuses to load the data. The only workaround would be to preserve and drop observations beforehand, but that comes at a high speed cost.
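For illustration, the preserve-and-drop workaround might look roughly like the following. This is a sketch under stated assumptions rather than what reghdfe does internally: it assumes that st_data() accepts the factor variable once no negative values remain in memory, and `X` is a hypothetical name for the resulting Mata matrix.

```stata
sysuse auto, clear
replace rep78 = rep78 - 3
gen byte touse = rep78 >= 0

* Temporarily drop the excluded observations so that no negative
* values of rep78 remain in memory when Mata reads the data.
preserve
    keep if touse
    * X is a hypothetical matrix name; no selectvar is needed here
    * because the unwanted rows are already gone.
    mata: X = st_data(., "i.rep78")
restore
```

The cost comes from preserve and restore, which copy the whole dataset; on large datasets that copy can dominate the runtime.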