RData.jl

Add "unique_colnames" keyword option to load()

Open alyst opened this issue 5 years ago • 5 comments

As mentioned in #44, it would be nice to add a unique_colnames=true option to load(), which would propagate to DataFrame({r data frame}, makeunique=unique_colnames). That would make the default behaviour reasonable while still allowing the user to control it.
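A minimal sketch of the proposed forwarding, assuming DataFrames.jl is installed. The helper `to_dataframe` is hypothetical and only stands in for the conversion step inside load(); it is not RData.jl's actual code. Only the `makeunique` keyword of the `DataFrame` constructor is taken from the real API.

```julia
using DataFrames

# Hypothetical stand-in for the R data frame -> DataFrame conversion in load().
# The proposed unique_colnames option is simply forwarded as makeunique.
function to_dataframe(columns::AbstractVector, colnames::AbstractVector{Symbol};
                      unique_colnames::Bool=true)
    return DataFrame(columns, colnames; makeunique=unique_colnames)
end

# Two columns sharing the name :x, as can happen in a data frame saved from R.
df = to_dataframe([[1, 2], [3, 4]], [:x, :x])

# With unique_colnames=true the duplicate column gets a distinguishing suffix.
println(names(df))
```

Passing unique_colnames=false forwards makeunique=false, in which case the DataFrame constructor should reject the duplicated names with an error instead of renaming them.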

alyst avatar Aug 10 '18 11:08 alyst

Why not call that argument makeunique for consistency?

nalimilan avatar Aug 17 '18 14:08 nalimilan

> Why not call that argument makeunique for consistency?

load() can return different objects. I'm not sure it is as clear as in the DataFrame ctor that makeunique refers to the columns of data frames (or matrices). If the user expects a vector of strings, then makeunique (which would default to true) would be really confusing.

alyst avatar Aug 17 '18 14:08 alyst

Yeah, maybe. However, it's not very explicit either for the DataFrame constructor (where it could also mean e.g. that rows should all be unique).

nalimilan avatar Aug 17 '18 18:08 nalimilan

I would be very much in favor of using the same keyword as the DataFrame ctor, once it is corrected. Maybe something like uniquenames? Or fixnames, meaning that it will make the names both unique and valid Julia identifiers. In the case of load("*.rda") it can potentially apply to DataFrame column names, matrix row/column names, list/vector element names, and top-level identifier names (variables).

alyst avatar Aug 20 '18 13:08 alyst

Makes sense. We would need quite a strong motivation to change the name in DataFrames though, given the burden it imposes on users.

nalimilan avatar Aug 27 '18 20:08 nalimilan