DataFramesMeta.jl
Add the `@rename` macro
It's not strictly necessary, because `DataFrames.rename` exists, but it is nice to have the keyword argument form.
Thanks!
This will be good to add pre 1.0.
Bumping this. I'm currently struggling to rename something programmatically in dplyr. It would be great to have this in DataFramesMeta.jl.
@pdeffebach would something silly like this be acceptable?
using DataFrames
using Chain
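# Macro entry point: build the call expression and escape it so that
# `x` and `arg` are evaluated in the caller's scope.
macro testrename(x, arg)
    esc(testrename_helper(x, arg))
end
# Return an expression equivalent to `DataFrames.rename(x, arg)`.
function testrename_helper(x, arg)
    quote
        $DataFrames.rename($x, $arg)
    end
end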
df = DataFrame(x=rand(10), y=rand(10))
df[!, Symbol("Y 2")] = randn(10)
@chain df begin
    @testrename [:x, :y] .=> [:apple, :banana]
    @testrename Symbol("Y 2") => :pineapple
end
Hi, @pdeffebach. I want to bump this to discuss the design. Should the final version of `@rename` be something like
df = DataFrame(old_col1 = 1:10, old_col2 = 1:10)
@rename(df, :new_col = :old_col1)
@rename df begin
    :new_col1 = :old_col1
    :new_col2 = :old_col2
end
Please let me know what you think!
Yes. I guess `rename` should allow arbitrary expressions on the RHS so that you can do
@rename df :newcol = begin
    "old_col" * "1"
end
and of course `@rename df $(:x => :y)`.
> Yes. I guess `rename` should allow arbitrary expressions on the RHS so that you can do

To make sure I'm understanding this correctly, would the DataFrames version of achieving this be something like
df = DataFrame(a_col = rand(10))
rename(x->x * "1", df)
No. See the `:newcol = ...` at the top; it would be equivalent to
rename(df, ("old_col" * "1") => :new_col)
`@rename` should only ever create a `rename(df, a => b, c => d)` type of expression. No need to bother with the function stuff.
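The translation described above can be sketched without DataFrames at all, since it only manipulates expressions. The helper names `pair_expr` and `rename_call` below are hypothetical and are not DataFramesMeta's actual implementation. The sketch only builds the `rename(df, a => b, c => d)` call as an expression and does not evaluate it:

```julia
# Hypothetical helper (not DataFramesMeta's code): turn the expression
# `:new = old` into the expression `old => new`.
function pair_expr(ex::Expr)
    ex.head == :(=) || throw(ArgumentError("expected `:new = old`, got: $ex"))
    new_name, old_name = ex.args
    return :($old_name => $new_name)
end

# Hypothetical helper: assemble `rename(df, a => b, c => d)` as an expression.
# A real macro would return this expression (escaped) instead of printing it.
rename_call(df, exprs...) = Expr(:call, :rename, df, map(pair_expr, exprs)...)

call = rename_call(:df, :(:new_col1 = :old_col1), :(:new_col2 = "old_col" * "2"))
println(call)
```

Because the right-hand side is spliced in unevaluated, an arbitrary expression such as `"old_col" * "2"` simply becomes the left side of the resulting pair.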
Got it, thanks. I currently have the following:
@rename df :new1 = begin
$("old_col" * "1")
end
However, arbitrary RHS expressions need to be escaped with `$`, as above. Further, arbitrary RHS expressions cannot be mixed with other standard expressions. For example,
@rename(df, :new1 = $("old_col" * "1"), :new2 = :old_col2)
errors in the same manner as
rename(df, ("old_col" * "1") => :new1, :old_col2 => :new2)
ERROR: MethodError: no method matching rename!(::DataFrame, ::Vector{Pair{A, Symbol} where A})
Which I think makes sense if this implementation is to rely on `rename`. Is this sufficient?
The error above occurs because all the old names need to be the same type and all the new names need to be the same type. Just cast everything to `string` in the final expression, i.e., produce
rename(df, string(("old_col" * "1")) => string(:new1), string(:old_col2) => string(:new2))
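To illustrate why the cast helps, here is a small sketch that uses only Base Julia and no DataFrames. It assumes the error comes from the pairs being collected into a vector whose element type is not concrete, which is what the `Vector{Pair{A, Symbol} where A}` in the message suggests. The variable names are made up for illustration:

```julia
# Pairs as a user might write them: one old name is a String, the other a Symbol.
mixed = ["old_col" * "1" => :new1, :old_col2 => :new2]

# Cast both sides of every pair to String, as suggested above.
uniform = [string(old_name) => string(new_name) for (old_name, new_name) in mixed]

println(isconcretetype(eltype(mixed)))    # element type of the mixed pairs
println(isconcretetype(eltype(uniform)))  # element type after casting
println(uniform)
```

After the cast, every pair has the same `String => String` type, so a single method accepting string pairs can handle the whole set.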