reghdfe

issue with if statements, factor variables, negative values

Open mdroste opened this issue 5 years ago • 1 comments

This is a minor bug, but I wanted to flag it because the behavior differs from -regress-. Stata doesn't allow factor variables to take negative values, but official commands accept them when an -if- condition excludes the negative values from the estimation sample. reghdfe, however, throws an error in this case.

Consider the following MWE:

clear all
set obs 10000
gen x = runiform()
gen z = ceil(runiform()*11-6)
gen g = ceil(runiform()*10)
gen y = g + z + x + rnormal()

* this works fine (areg is used because regress has no absorb() option)
areg y x i.z if z>=0, absorb(g)

* this throws an error about negative values for z
reghdfe y x i.z if z>=0, absorb(g)

As a workaround, users can move z into the absorb() option (optionally saving the estimated FEs to variables), and that works fine. Alternatively, add a constant to z so that it is always non-negative and adjust the -if- condition accordingly.
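Both workarounds can be sketched on the MWE data as follows. This is only an illustration, assuming reghdfe is installed from SSC; the seed, the local macro shift, and the variable z_shift are names introduced here and are not part of the original report.

```stata
* Sketch of the two workarounds; assumes reghdfe is installed (ssc install reghdfe)
clear all
set seed 12345                      // arbitrary seed, for reproducibility only
set obs 10000
gen x = runiform()
gen z = ceil(runiform()*11-6)
gen g = ceil(runiform()*10)
gen y = g + z + x + rnormal()

* Workaround 1: absorb z instead of entering it as i.z
reghdfe y x if z>=0, absorb(g z)

* Workaround 2: shift z so that it is never negative,
* then shift the threshold in the if condition by the same amount
summarize z, meanonly
local shift = -r(min)               // hypothetical helper macro
gen z_shift = z + `shift'
reghdfe y x i.z_shift if z_shift >= `shift', absorb(g)
```

In the first variant the coefficients on the levels of z are not reported; adding the savefe suboption, as in absorb(g z, savefe), should store the estimated fixed effects as variables if they are needed. In the second variant the levels of i.z_shift are the original levels of z plus the shift.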

mdroste, Mar 04 '19 12:03

I looked into it, but it seems the problem needs to be solved on Stata's side:

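sysuse auto, clear
* shift rep78 so that some of its values are negative
replace rep = rep - 3
* mark the observations with non-negative values
gen byte touse = rep >= 0
* fails even though touse excludes the negative values
mata: st_data(., "i.rep", "touse")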

Even if the touse variable restricts the sample to the non-negative range of rep, Mata refuses to load the data. The only workaround would be to preserve and drop observations beforehand, but that comes at a high speed cost.
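A minimal sketch of the preserve-and-drop workaround mentioned above, using the same auto dataset. It assumes the error depends only on the values present in memory when the data are loaded; the Mata matrix name X is introduced here for illustration.

```stata
sysuse auto, clear
replace rep78 = rep78 - 3

preserve
    * drop negative (and missing) values so the factor variable is valid
    keep if rep78 >= 0 & !missing(rep78)
    * with no negative values left in memory, the load should go through
    mata: X = st_data(., "i.rep78")
    mata: rows(X), cols(X)
restore
```

The preserve/restore pair copies the dataset, which is the speed cost noted above and is why this approach is unattractive inside a command designed for large datasets.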

sergiocorreia, Mar 07 '19 11:03