StatsModels.jl

Capture variables in formula from enclosing scope

Open CameronBieganek opened this issue 5 years ago • 2 comments

It would be really handy if formulas could capture variables from enclosing scopes. For example:

julia> using StatsModels, DataFrames

julia> df = DataFrame(x = randn(10), y = randn(10));

julia> T = 10
10

julia> modelmatrix(@formula(y ~ cos(x / T)), df)
ERROR: type NamedTuple has no field T
Stacktrace:
 [1] _broadcast_getindex_evalf at ./broadcast.jl:578 [inlined]
 [2] _broadcast_getindex at ./broadcast.jl:551 [inlined]
 [3] #19 at ./broadcast.jl:953 [inlined]
 [4] ntuple at ./tuple.jl:160 [inlined]
 [5] copy at ./broadcast.jl:953 [inlined]
 [6] materialize at ./broadcast.jl:753 [inlined]
 [7] modelcols(::FunctionTerm{typeof(cos),getfield(Main, Symbol("##3#5")),(:x, :T)}, ::NamedTuple{(:x, :y),Tuple{Array{Float64,1},Array{Float64,1}}}) at /home/cbieganek/.julia/packages/StatsModels/SDWnE/src/terms.jl:472
 [8] #24 at ./none:0 [inlined]
 [9] iterate at ./generator.jl:47 [inlined]
 [10] collect at ./array.jl:606 [inlined]
 [11] modelcols(::MatrixTerm{Tuple{FunctionTerm{typeof(cos),getfield(Main, Symbol("##3#5")),(:x, :T)}}}, ::NamedTuple{(:x, :y),Tuple{Array{Float64,1},Array{Float64,1}}}) at /home/cbieganek/.julia/packages/StatsModels/SDWnE/src/terms.jl:520
 [12] #modelmatrix#45(::Dict{Symbol,Any}, ::Type{StatisticalModel}, ::Function, ::FunctionTerm{typeof(cos),getfield(Main, Symbol("##3#5")),(:x, :T)}, ::DataFrame) at /home/cbieganek/.julia/packages/StatsModels/SDWnE/src/modelframe.jl:100
 [13] #modelmatrix#44 at /home/cbieganek/.julia/packages/StatsModels/SDWnE/src/modelframe.jl:97 [inlined]
 [14] modelmatrix(::FormulaTerm{Term,FunctionTerm{typeof(cos),getfield(Main, Symbol("##3#5")),(:x, :T)}}, ::DataFrame) at /home/cbieganek/.julia/packages/StatsModels/SDWnE/src/modelframe.jl:93
 [15] top-level scope at none:0

The corresponding code in R works as expected:

> df <- data.frame(x = rnorm(10), y = rnorm(10))
> T <- 10
> model.matrix(y ~ cos(x/T), df)
   (Intercept)  cos(x/T)
1            1 0.9972878
2            1 0.9976263
3            1 0.9901339
4            1 0.9964463
5            1 0.9969247
6            1 0.9999974
7            1 0.9967823
8            1 0.9958416
9            1 0.9986590
10           1 0.9916724
attr(,"assign")
[1] 0 1

CameronBieganek, Jul 11 '19 16:07

This is actually really tricky to do in Julia. One possibility I'd considered is to use $ to capture values from the local scope at the point where @formula is invoked, so you could write @formula(y ~ cos(x/$T)), but only if T is defined in the same scope where @formula is called (inspired by how BenchmarkTools handles this).
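
To make the $ idea concrete, here is a minimal sketch in plain Julia. It is not StatsModels code: `@rowfun` and `interpolate_dollars` are hypothetical names made up for illustration, and how @formula would actually implement this is an open question. The macro turns an expression in `x` into a function and resolves any `$name` in the scope where the macro is called.

```julia
# Hypothetical helper: walk an expression and replace each `$name` with the
# escaped name, so that it is looked up where the macro is called.
interpolate_dollars(ex) = ex  # symbols, literals, line numbers: unchanged
function interpolate_dollars(ex::Expr)
    ex.head === Symbol("\$") && return esc(ex.args[1])
    return Expr(ex.head, map(interpolate_dollars, ex.args)...)
end

# Hypothetical macro: build `x -> <expression>`, where `x` stands in for a
# data column and every `$name` refers to a variable at the call site.
macro rowfun(ex)
    body = interpolate_dollars(ex)
    return :(x -> $body)
end

function demo()
    T = 10                       # local to `demo`, invisible to globals
    f = @rowfun cos(x / $T)      # `$T` picks up the local `T`
    return f(5.0)
end

result = demo()
```

Because `T` is escaped rather than evaluated by the macro, it has to be visible where the macro is called, which matches the "same scope" restriction described above.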

kleinschmidt, Jul 11 '19 22:07

Using $ for this seems reasonable to me, though I'm no expert here.

CameronBieganek, Jul 12 '19 19:07
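
In the meantime, one workaround that seems consistent with the error in the original report (the `FunctionTerm` in the stack trace lists both `:x` and `:T` as names to look up in the data) is to hide the outside value inside a function of the column alone. This is a sketch under that assumption, not an officially documented recipe; `scaled_cos` is a hypothetical name, and `T` is a non-constant global as in the original example.

```julia
using StatsModels, DataFrames

df = DataFrame(x = randn(10), y = randn(10))
T = 10

# Only `x` appears as an argument inside the formula, so only `x` should be
# looked up as a column; `T` is resolved inside `scaled_cos` by ordinary
# Julia scoping rules.
scaled_cos(x) = cos(x / T)

X = modelmatrix(@formula(y ~ scaled_cos(x)), df)
```

The cost is one named helper per transformation, which is exactly the boilerplate that a $ syntax would remove.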