FixedEffectModels.jl
Ignore rows with `Inf`s?
The following seems like a 'classic' trap:
using FixedEffectModels, RDatasets
df = dataset("plm", "Cigar")
# a zero in Sales becomes -Inf once logs are taken
df.Sales[1] = 0.0
df.logsales = log.(df.Sales)
reg(df, @formula(logsales ~ NDI + fe(State) + fe(Year)), Vcov.cluster(:State), weights = :Pop)
gives
ERROR: "Some observations for the dependent variable are infinite"
Stacktrace:
[1] reg(df::Any, formula::FormulaTerm, vcov::StatsBase.CovarianceEstimator; contrasts::Dict, weights::Union{Nothing, Symbol}, save::Union{Bool, Symbol}, method::Symbol, nthreads::Integer, double_precision::Bool, tol::Real, maxiter::Integer, drop_singletons::Bool, progress_bar::Bool, dof_add::Integer, subset::Union{Nothing, AbstractVector}, first_stage::Bool)
@ FixedEffectModels ~/.julia/packages/FixedEffectModels/kJPKw/src/fit.jl:176
[2] top-level scope
@ REPL[9]:1
I feel the package could automatically drop rows where the regressand or one of the regressors is infinite, similarly to how it does with missings. What's the argument against that?
It's a bit tricky because Inf is meaningful. I will leave this issue open so that people can report if they encounter the same issue.
I run into this sometimes as well. The R fixest package also automatically drops those observations.
I think a "do what I mean" approach (silently dropping `Inf`) is unidiomatic in the Julia Stats ecosystem. If rows are being ignored, there should at a minimum be a warning.
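Until the package settles on a behavior, one explicit workaround is to drop the non-finite rows yourself before calling `reg`, with a warning so nothing disappears silently. The following is only a sketch: `drop_nonfinite` is a hypothetical helper, not part of FixedEffectModels, and the toy data frame stands in for the `Cigar` data above.

```julia
using DataFrames

# Hypothetical helper (not part of FixedEffectModels): keep only the rows in
# which every listed column is finite, and warn about how many were dropped.
function drop_nonfinite(df::AbstractDataFrame, cols::Vector{Symbol})
    keep = trues(nrow(df))
    for c in cols
        # isfinite(missing) is missing; coalesce keeps such rows so that
        # missing values are still left for reg to handle in its usual way.
        keep .&= coalesce.(isfinite.(df[!, c]), true)
    end
    ndropped = nrow(df) - count(keep)
    ndropped > 0 && @warn "Dropping $ndropped row(s) with non-finite values in $cols"
    return df[keep, :]
end

# Toy data standing in for the Cigar example: a zero in Sales gives -Inf in logs.
toy = DataFrame(Sales = [0.0, 10.0, 20.0], NDI = [1.0, 2.0, 3.0])
toy.logsales = log.(toy.Sales)
clean = drop_nonfinite(toy, [:logsales, :NDI])
```

The cleaned data frame can then be passed to `reg` in place of `df`. This keeps the decision to discard observations visible in the calling code, which is the concern raised in the last comment.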