Statistics.jl
median with missing data and reduction over a given dimension
The following calls to sum and median work:
using Statistics
sum([1. 2. missing; 2. 3. 4.]; dims = 1)
# 1×3 Array{Union{Missing, Float64},2}:
# 3.0 5.0 missing
and
using Statistics
median([1., 2., missing]) # returns missing
However, a call to median on an array containing a missing value fails when a reduction over a specified dimension is requested:
julia> median([1. 2. missing; 2. 3. 4.]; dims = 1)
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type Float64
Closest candidates are:
convert(::Type{T<:Number}, ::T<:Number) where T<:Number at number.jl:6
convert(::Type{T<:Number}, ::Number) where T<:Number at number.jl:7
convert(::Type{T<:Number}, ::Base.TwicePrecision) where T<:Number at twiceprecision.jl:250
...
Stacktrace:
[1] setindex!(::Array{Float64,2}, ::Missing, ::Int64) at ./array.jl:767
[2] setindex! at ./subarray.jl:293 [inlined]
[3] macro expansion at ./broadcast.jl:843 [inlined]
[4] macro expansion at ./simdloop.jl:73 [inlined]
[5] copyto! at ./broadcast.jl:842 [inlined]
[6] copyto! at ./broadcast.jl:797 [inlined]
[7] materialize!(::SubArray{Float64,1,Array{Float64,2},Tuple{Base.OneTo{Int64},Int64},true}, ::Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},Nothing,typeof(identity),Tuple{Tuple{Missing}}}) at ./broadcast.jl:756
[8] concatenate_setindex!(::Array{Float64,2}, ::Missing, ::Base.OneTo{Int64}, ::Vararg{Any,N} where N) at ./abstractarray.jl:2005
[9] inner_mapslices!(::Bool, ::Base.Iterators.Drop{CartesianIndices{1,Tuple{Base.OneTo{Int64}}}}, ::Int64, ::Array{Any,1}, ::Array{Int64,1}, ::Array{Any,1}, ::Array{Union{Missing, Float64},1}, ::Array{Union{Missing, Float64},2}, ::typeof(median!), ::Array{Float64,2}) at ./abstractarray.jl:1986
[10] #mapslices#109(::Int64, ::Function, ::typeof(median!), ::Array{Union{Missing, Float64},2}) at ./abstractarray.jl:1976
[11] #mapslices at ./none:0 [inlined]
[12] _median at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Statistics/src/Statistics.jl:757 [inlined]
[13] #median#44 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Statistics/src/Statistics.jl:755 [inlined]
[14] (::getfield(Statistics, Symbol("#kw##median")))(::NamedTuple{(:dims,),Tuple{Int64}}, ::typeof(median), ::Array{Union{Missing, Float64},2}) at ./none:0
[15] top-level scope at none:0
I see this behaviour in Julia 1.0.3 and Julia 1.1.0. I think median should behave like sum here and return missing for the affected slice rather than throw an error.
Is this a known issue?
It's due to a limitation of mapslices. See https://github.com/JuliaLang/julia/pull/31217.
We should maybe switch to map(median, eachslice(X; dims = ...)) as suggested by @mbauman in that PR.
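Until that changes, the eachslice approach can be used directly as a workaround. The following is a minimal sketch, assuming Julia 1.1 or later (where eachslice is available); the helper name median_dim1 is hypothetical and not part of Statistics.jl. Note that reducing over dims = 1 means taking one median per column, so the slices are taken along dims = 2.

```julia
using Statistics

X = [1. 2. missing; 2. 3. 4.]

# Hypothetical helper (not part of Statistics.jl): median over dims = 1,
# i.e. one median per column, reshaped to the 1×N layout that sum(X; dims = 1) returns.
median_dim1(A::AbstractMatrix) = reshape(map(median, eachslice(A; dims = 2)), 1, :)

# missing propagates for any column that contains a missing value
m = median_dim1(X)

# Variant that ignores missing values within each column instead of propagating them
m_skip = map(col -> median(skipmissing(col)), eachslice(X; dims = 2))
```

Here m is a 1×3 matrix holding 1.5, 2.5 and missing, matching the way sum propagates missing, while m_skip is the vector [1.5, 2.5, 4.0].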