Missings.jl
Missings.jl copied to clipboard
add @? macro
I think it is high time to make a bold decision and add @?
to Missings.jl. The proposal is slightly modified original @tkf code.
@pdeffebach + @jkrumbiegel I was not sure how to properly add a docstring which should be:
"""
@?
Wrap an expression `ex` into `Union{Missing, ex}`.
```jldoctest
julia> @?String
Union{Missing, String}
julia> @?(Int)[1, 2, 3]
3-element Vector{Union{Missing, Int64}}:
1
2
3
```
"""
CC @nalimilan
if we're cool with requiring Julia 1.3+ (I am at least 😛)
Good point. I am OK to require 1.3+. In downstream packages (like DataFrames.jl) we can allow several versions of Missings.jl. Also I assume that soon 1.0 will stop being LTS and then we will stop supporting it.
Technically we could allow both (by checking VERSION
) but I think it is time to slowly start dropping 1.0 support in JuliaData ecosystem (we can wait with merging the PR).
but I think it is time to slowly start dropping 1.0 support in JuliaData ecosystem
We've already moved that way with Arrow.jl, and a few others I think. I've heard through the grapevine that 1.6 or 1.7 will be a new LTS, so I agree it should be fine to start officially dropping 1.0.
Missings.jl is more delicate as it is a dependency of e.g DataFrames.jl so this decision should be coordinated across the whole ecosystem.
@oxinabox - would dropping of support of Julia 1.0 be considered breaking following the semver?
In this PR (if we want to leave this decision for later and if we agree that @?
macro is useful to be added here) I can still make this PR supported under Julia 1.0 (and leave a TODO to clean it up later).
not breaking. as it can't break anyones code as the package manager will not allow old users to see it.
I have made it work in all versions post 1.0 including. Just on old versions I skipped docstring.
Seeing this in JuliaData, I imagine there has been extensive discussion on this topic. But I think it's rather unfortunate to use such a terse syntax "just" for syntax sugar that can even be implemented as a type alias (e.g., MayNotBe{T} = Union{T,Missing}
). At the same time, I'd imagine a shorthand is valuable for readability if you need to write it in gazillion of types.
Maybe we can avoid "wasting" the syntax space too much by only using @?{...}
? Something like
macro var"?"(ex)
(Meta.isexpr(ex, :braces) && length(ex.args) == 1) || error("...")
t, = ex.args
:(Union{$(esc(t)), Missing})
end
so that @?{Int} == Union{Int,Missing}
?
It'd let us use, e.g., @? f(g(x, y), z)
for (maybe short-circuiting) multi-call "lifting" syntax in the future.
I imagine there has been extensive discussion on this topic
There have been lengthily discussions and they all end up in deciding not to decide yet.
Ideally it would be Int?
a shorthand is valuable for readability if you need to write it in gazillion of types.
This is the point. An example from a recent user question:
dst = DataFrame(A=Union{Int,Missing}[],
B=Union{Int,Missing}[],
C=Union{Int,Missing}[],
D=Union{Int,Missing}[],
E=Vector{Union{Int,Missing}}[],
F=Vector{Union{Int,Missing}}[],
G=Vector{Union{Int,Missing}}[],
Ga=Vector{Union{Int,Missing}}[],
H=Union{Int,Missing}[])
It is really painful to type and read.
And I think we just have to decide on something as JuliaData ecosysem gets more adoption. The reason why it is proposed in Missings.jl is exactly that not all people in general need to want to use it.
@?{...}
This is one option.
Another options that came to my mind are:
Maybe{T} = Union{Missing, T} # not so good as Maybe typically is with Nothing, but we have Some already
and
Base.Missing(T::Type) = Union{T, Missing} # as constructor does not guarantee to return an instance
then we would write Missing(Int)
which is not as good as Int?
but a bit shorter than Union{Missing, Int}
.
Let us discuss. There is no rush to decide.
Maybe{T} = Union{Missing, T}
Is Maybe
used for Union{T,Missing}
in the data analysis field? As you mentioned, I associate Maybe
rather to Union{Some{T},Nothing}
since that's what other languages use: https://en.wikipedia.org/wiki/Option_type#Names_and_definitions. If it's nonstandard, avoiding name collision may be a good idea.
(Conflict of interest: I wrote Maybe.jl which is built around the idea of improving the usability of Union{Some{T},Nothing}
. So, I'd say I'm biased :slightly_smiling_face:)
Missing(Int)
I strongly suggest that we should keep the invariance C(...) :: C
for any type C
. I know the Julia manual says it's OK to not do so in "very rare cases" but I personally think it should just be discouraged and error-checked by the compiler in the future. It's easy to find code that depends on this invariance, even in Base
.
@tkf - I agree with both comments in general. I am just looking for some solution that would shorten the code for users. Let us wait - maybe someone will come up with another proposals.
Clever PR! I'm not sure how I feel about this. @tkf's remarks are very relevant. I agree that Maybe
would be more appropriate for Union{T, Nothing}
or Union{Some{T}, Nothing}
. But other solutions are possible, like MissingOr{T}
or maybe even M{T}
(which isn't worse than @?
in terms of explicitness). That said, the need to type braces (and a type alias) makes a special syntax less useful that just @?
...
At any rate if we went with @?
I think we should discuss this more widely with Julia developers, as it's very similar to T?
so it would be confusing if that syntax was finally used to mean Union{T, Nothing}
.
OK - let us call @ViralBShah then and ask him whom we should ask 😄.
@StefanKarpinski is the person whose attention needs to be drawn here. And also @jeffbezanson, if he wants to chime in.