RCall.jl
RCall.jl copied to clipboard
Both DataValues and RCall export isna
I just realized that on julia 1.0, both DataValues.jl and RCall.jl are exporting a isna
function. That is not ideal, because I would think that both packages will end up being loaded simultaneously quite often and then folks can't properly use the function...
Here is some background info: DataValues.jl uses NA
as its value for missing data. On julia 0.6 and earlier, it used isnull
for checking for missing data. I never liked that and always thought it should be isna
, but then DataArrays.jl exported isna
for a long time and I didn't want to take a dependency on it. Plus, after the transition to Missing
, DataArrays.jl of course also no longer exported isna
. At that point I still didn't switch over to isna
because I couldn't be sure what version of DataArrays.jl folks would actually have, and I didn't want any conflicts to arise. But the julia 1.0 version of DataValues.jl now makes isna
the main and only way to check for missing values. I had thought that no other package was still using it, so I thought it would be unproblematic to "claim" that name for DataValue
. I wasn't aware that RCall.jl was also using it.
Not sure what the best solution is... Here are some options: 1) RCall.jl uses something else (maybe ismissing
?) instead, 2) DataValues.jl uses something else, 3) RCall.jl takes a dependency on DataValues.jl and we all use the same isna
function or 4) we create a common root package that only has a isna
definition in it, both RCall.jl and DataValues.jl take a dep on that and we again share a common function...
I don't know how problematic 1) would be, obviously from my point of view it would solve the problem, but I could entirely understand if that is not an option. Same for 3). I think I'm 99% certain that 2) is not really an option from the Queryverse point of view (a fair number of reasons, happy to explain, if anyone is interested). 4) seems a bit silly, but on the other hand it would solve this issue without anyone having to really make any difficult changes. Maybe there are other options out there as well?
CC @nalimilan
isna
corresponds directly to the R is.na
function, so I don't think renaming is really an option.
To be honest, I don't think the status quo (forcing users to qualify their usage) is that big of a problem.
You know my position already: move DataValues to ismissing
. :-)
I probably put the isna
function in RCall
initially because I was duplicating the R API in Julia code and it was before Missing
was part of Julia. It would be quite sensible for RCall
to define methods for ismissing
instead of isna
. Of course, @randy3k is the most knowledgeable person to judge that.
Among the proposed solutions my least favorite is having RCall
depend on DataValues
. I hope that the Julia Data ecosystem will move towards using the standard Missing
type and Union{Missing,BitsType}
for missing data representation exclusively and not propagate multiple standards. We have enough work to do as it is; let's not compound our problems.
I am with @simonbyrne. isna
is the most natural function name for checking R's NA value. And we may want to distinguish R's NA ~Verdi~ versus (damn auto correction) Julia “NA” type Missing
. I am also curious why DataValues needs a new missing data representation at all?
Unless there are enough reasons to support DataValues as the default coninical Julia missing type for R objects, I don’t see depedency of DataValues.jl is a viable option.