Query.jl
Make $x syntactic sugar for _.x
I'm not sure this is really a good idea, but I'd like to get some feedback :) The idea would be that $x is just syntactic sugar for _.x in the standalone query commands. So instead of writing:
data |> @mutate(x = 2*_.x)
one could write
data |> @mutate(x = 2 * $x)
The benefit would be that one saves one character per column reference.
I would really like to see some syntactic sugar for _.x.
I think $ is nice in that with data manipulations it's almost analogous to interpolation. However, it being almost the same as interpolation in strings, commands or expressions does introduce some confusion about what a user should expect from this operator.
I particularly like the DataFramesMeta.jl approach of using Symbols, in that it is (in most cases) quite apparent that the code would cause an error if run as-is. It's fairly self-evident that it's not behaving as you'd typically expect.
In this case, I think it's better to be obvious that the syntax doesn't conform to other paradigms of the language than it is to try to find an almost analogous behavior to map it to.
A bit of brainstorming on the topic: Running with the Symbol syntax, perhaps $ would be better suited for interpolating expressions, so that you could programmatically access columns, similar to R's rlang::`!!` and its wrapper tidyeval. (There are some dplyr examples in the tidyeval cookbook.)
Interpolating Symbols
y = :x
data |> @mutate(x = 2 * $y) # analogous to `rlang::!!`
equal to
data |> @mutate(x = 2 * :x)
Interpolating Expressions
y = :(2 * :x)
data |> @mutate(x = $y)
equal to
data |> @mutate(x = 2 * :x)
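For comparison, this is how $ interpolation already behaves inside quoted expressions in base Julia, which is the behavior the examples above would be borrowing from. It is a minimal sketch using only Base; the variable names are illustrative and nothing here is Query.jl API.

```julia
# Interpolating a Symbol into a quoted expression
y = :x
ex1 = :(2 * $y)        # builds the expression 2 * x

# Interpolating a whole expression into a larger one
inner = :(2 * x)
ex2 = :(1 + $inner)    # builds the expression 1 + 2 * x

# Both results are ordinary Expr objects that can be compared or inspected
println(ex1 == :(2 * x))
println(ex2 == :(1 + 2 * x))
```

Here $ substitutes the value of a variable into the expression being built, which is why reusing $ to mean "column of the current row" may read as surprising to some users.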
That said, I can't think of a way of implementing something like this as a macro step off the top of my head - maybe with a custom constructor for these data-specific expressions.
Perhaps I'm missing something, but I think $ used outside an expression, string or command throws a syntax error before it can even be intercepted by a macro.
julia> macro test_dollar(e)
           println(e)
           e
       end
julia> y = :x
julia> test_dollar($y)
# ERROR: syntax: "$" expression outside quote
# Stacktrace:
# [1] top-level scope at REPL[3]:1
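One thing worth checking here: the call above is written without the @ prefix, so it is parsed as an ordinary function call. As far as I can tell, when a macro is invoked with @, the parser passes $y to it as an Expr with head :$, and the "outside quote" error only appears later if that node is still present after macro expansion (BenchmarkTools.jl relies on this for `@btime f($x)`). The sketch below is purely illustrative: `replace_dollar` and `@dollar_fn` are hypothetical names, not part of Query.jl, and the row is assumed to be anything that supports field access, such as a NamedTuple.

```julia
# Hypothetical helper: walk an expression and rewrite `$name` into `row.name`.
replace_dollar(ex, row) = ex   # literals and symbols are left untouched
function replace_dollar(ex::Expr, row)
    if ex.head === :$ && length(ex.args) == 1 && ex.args[1] isa Symbol
        # `$x` becomes the field access `row.x`
        return Expr(:., row, QuoteNode(ex.args[1]))
    end
    return Expr(ex.head, map(a -> replace_dollar(a, row), ex.args)...)
end

# Hypothetical macro: turn an expression containing `$x` into a function of one row.
macro dollar_fn(ex)
    row = gensym(:row)                  # fresh argument name to avoid clashes
    body = replace_dollar(ex, row)
    return esc(:($row -> $body))
end

f = @dollar_fn(2 * $x)
result = f((x = 3, y = 10))             # reads the `x` field of the NamedTuple
println(result)
```

Because the macro removes every :$ node before returning, nothing is left for the later lowering step to reject. Whether this is a good fit for Query.jl's own macros is a separate question.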
Coming from R and the Tidyverse, I saw the way that DataFrames.jl uses Symbols, e.g. when joining:
join(people, jobs, on = :ID)
or reshaping:
stack(iris, [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth])
So I was slightly confused by using _. , as Symbols already seemed like a good fit and were consistent with other packages.
It was one of the reasons I started learning DataFramesMeta.jl instead of Query.jl. Using _. reminded me of the Python packages that try to bend Python's syntax to be more like dplyr.
Using Symbols would also be consistent with VegaLite.jl.