JET does not know that concatenating vectors results in a vector?

Open Krastanov opened this issue 3 years ago • 4 comments

Here are two false positives that I have trouble understanding:

using JET

function f(idx)
    N = 4
    indices_flat = [idx...;]
    Int[i for i=1:N if i ∉ indices_flat]
end

first false positive:

@assert f([[1,2],[3]]) == [4]
@report_call f([[1,2],[3]])
═════ 2 possible errors found ═════
┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N)))
┌ @ array.jl:642 Base._collect(T, itr, Base.IteratorSize(itr))
┌ @ array.jl:644 Base._array_for(T, isz, Base._similar_shape(itr, isz))
┌ @ array.jl:674 Base._similar_shape(itr, isz)
┌ @ array.jl:659 axes(itr)
┌ @ abstractarray.jl:98 size(A)
│ no matching method found `size(::Base.HasLength)`: size(A::Base.HasLength)
└───────────────────────
┌ @ array.jl:658 length(itr)
│ no matching method found `length(::Base.HasLength)`: length(itr::Base.HasLength)
└────────────────

second false positive:

@assert f([[],[]]) == [1,2,3,4]
@report_call f([[],[]])
═════ 1 possible error found ═════
┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N)))
┌ @ array.jl:642 Base._collect(T, itr, Base.IteratorSize(itr))
┌ @ array.jl:648  = iterate(itr)
┌ @ generator.jl:44 y = iterate(tuple(g.iter), s...)
┌ @ iterators.jl:510 goto %11 if not f.flt(y[1])
│ non-boolean `Missing` found in boolean context (1/2 union split): goto %11 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
└────────────────────
julia> versioninfo()
Julia Version 1.9.0-DEV.1657
Commit 35d12890aba (2022-10-25 07:47 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 4 on 8 virtual cores

(ttt) pkg> st
Status `/tmp/ttt/Project.toml`
  [c3a54625] JET v0.6.14

Krastanov, Nov 13 '22 15:11

It looks like the first report appears because the type of the Base.Generator is inferred imprecisely (see ::Base.Generator{_A, typeof(identity)} where _A in the second frame of the abstract stack trace):

julia> callf(f, args...) = f(args...)
callf (generic function with 1 method)

julia> @report_call annotate_types=true callf(f, [[1,2],[3]])
═════ 2 possible errors found ═════
┌ @ REPL[6]:1 f::typeof(f)(args::Tuple{Vector{Vector{Int64}}}...)
│┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N::UnitRange{Int64})::Base.Iterators.Filter{_A, UnitRange{Int64}} where _A)::Base.Generator{_A, typeof(identity)} where _A)
││┌ @ array.jl:642 Base._collect(T::Type{Int64}, itr::Base.Generator{_A, typeof(identity)} where _A, Base.IteratorSize(itr::Base.Generator{_A, typeof(identity)} where _A)::Any)
│││┌ @ array.jl:644 Base._array_for(T::Type{Int64}, isz::Union{Base.HasLength, Base.HasShape}, Base._similar_shape(itr::Base.Generator{_A, typeof(identity)} where _A, isz::Union{Base.HasLength, Base.HasShape})::Any)
││││┌ @ array.jl:674 Base._similar_shape(itr::Base.HasLength, isz::Any)
│││││┌ @ array.jl:659 axes(itr::Base.HasLength)
││││││┌ @ abstractarray.jl:98 size(A::Base.HasLength)
│││││││ no matching method found `size(::Base.HasLength)`: size(A::Base.HasLength)
││││││└───────────────────────
│││││┌ @ array.jl:658 length(itr::Base.HasLength)
││││││ no matching method found `length(::Base.HasLength)`: length(itr::Base.HasLength)
│││││└────────────────

The second example isn't a false positive, though. Since the vector is typed as Vector{Any}, from the point of view of type inference it may contain a missing element. In that case i ∉ indices_flat evaluates to missing, so the non-boolean condition can actually happen.

julia> @report_call annotate_types=true callf(f, [[],[]])
═════ 1 possible error found ═════
┌ @ REPL[6]:1 f::typeof(f)(args::Tuple{Vector{Vector{Any}}}...)
│┌ @ REPL[2]:4 collect(Int, Base.Generator(identity, Base.Filter(#3, 1 : N::UnitRange{Int64})::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}})::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})
││┌ @ array.jl:642 Base._collect(T::Type{Int64}, itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}, Base.IteratorSize(itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})::Base.SizeUnknown)
│││┌ @ array.jl:648  = iterate(itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)})
││││┌ @ generator.jl:44 y = iterate(tuple((g::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}).iter::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}})::Tuple{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}}, s::Tuple{}...)
│││││┌ @ iterators.jl:514 goto %12 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
││││││ non-boolean `Missing` found in boolean context (1/2 union split): goto %12 if not (f::Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}).flt::var"#3#4"{Vector{Any}}((y::Tuple{Int64, Int64})[1]::Int64)::Union{Missing, Bool}
│││││└────────────────────


julia> callf(f, [Any[missing],Any[missing]])
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
 [1] iterate
   @ ./iterators.jl:514 [inlined]
 [2] iterate
   @ ./generator.jl:44 [inlined]
 [3] _collect(#unused#::Type{Int64}, itr::Base.Generator{Base.Iterators.Filter{var"#3#4"{Vector{Any}}, UnitRange{Int64}}, typeof(identity)}, isz::Base.SizeUnknown)
   @ Base ./array.jl:648
 [4] collect
   @ ./array.jl:642 [inlined]
 [5] f(idx::Vector{Vector{Any}})
   @ Main ./REPL[2]:4
 [6] callf(f::Function, args::Vector{Vector{Any}})
   @ Main ./REPL[6]:1
 [7] top-level scope
   @ REPL[9]:1
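
To make the Missing case concrete, here is a minimal sketch of the three-valued logic behind the report, followed by one possible workaround. The name f_typed and the use of collect(Int, Iterators.flatten(idx)) are assumptions for illustration and were not proposed in this thread; the idea is only that forcing an Int element type should keep the ∉ test a plain Bool.

```julia
# Membership in a container that holds `missing` is unknown, not false.
r = 1 ∉ Any[missing]
println(r)            # prints: missing

# An empty Vector{Any} still yields a plain Bool at run time, which is why
# the original call f([[],[]]) succeeds even though JET reports it.
println(1 ∉ Any[])    # prints: true

# Hypothetical variant of `f` (not from the thread): flatten into a
# Vector{Int} so the filter predicate compares Int against Int only.
function f_typed(idx)
    N = 4
    indices_flat = collect(Int, Iterators.flatten(idx))
    return Int[i for i in 1:N if i ∉ indices_flat]
end

println(f_typed([[1, 2], [3]]))
println(f_typed([[], []]))
```

With this variant an input such as [Any[missing]] fails earlier, at the conversion to Int, instead of inside the filter.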

aviatesk, Nov 16 '22 06:11

Probably a silly question: why was the callf indirection necessary? I do not think I understand this trick.

Is it feasible to help Generator not lose this type information? Should I be filing an issue with julialang/julia?

Krastanov, Nov 16 '22 06:11

> why was the callf indirection necessary?

Ah, I just wanted to check that @report_call generates reasonable type information for the input function call, e.g. f::typeof(f)(args::Tuple{Vector{Vector{Int64}}}...), which we can see with the annotate_types option enabled.

> Is it feasible to help Generator not lose this type information? Should I be filing an issue with julialang/julia?

Yes, this seems to be a general type inference issue within Julia Base. The splatted vcat call is inferred as Union{Vector{Any}, Vector{Int64}}, so the closure, Filter, and Generator types are not concrete, and we can't get a precise return type:

julia> @code_typed f([[1,2],[3]])
CodeInfo(
1 ─ %1 = Core._apply_iterate(Base.iterate, Base.vcat, idx)::Union{Vector{Any}, Vector{Int64}}
│   %2 = Core.typeof(%1)::Union{Type{Vector{Any}}, Type{Vector{Int64}}}
│   %3 = Core.apply_type(Main.:(var"#3#4"), %2)::Type{var"#3#4"{_A}} where _A
│   %4 = %new(%3, %1)::var"#3#4"
│   %5 = Base.Filter(%4, $(QuoteNode(1:4)))::Base.Iterators.Filter{_A, UnitRange{Int64}} where _A
│   %6 = Base.Generator(Base.identity, %5)::Base.Generator{_A, typeof(identity)} where _A
│   %7 = Base.collect(Main.Int, %6)::AbstractArray
└──      return %7
) => AbstractArray
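
The union in %1 can be inspected in isolation. The sketch below is an illustration, not part of the original analysis: flatten_indices is a hypothetical wrapper around the same [idx...;] expression, and the likely source of the Vector{Any} branch (an assumption about why inference widens here) is that vcat with zero arguments returns an empty Vector{Any}. The inferred type printed at the end may differ between Julia versions, so no expected output is given for it.

```julia
# Hypothetical wrapper around the splatted concatenation used in `f`.
flatten_indices(idx) = [idx...;]

# Splatting an empty outer vector calls vcat() with no arguments,
# which produces a Vector{Any} regardless of the outer element type.
println(typeof(flatten_indices(Vector{Int}[])))

# A non-empty input concatenates to a Vector{Int64} as expected.
println(typeof(flatten_indices([[1, 2], [3]])))

# Ask inference for the return type given a Vector{Vector{Int}} argument.
rt = only(Base.return_types(flatten_indices, (Vector{Vector{Int}},)))
println(rt)
println(isconcretetype(rt))
```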

aviatesk, Nov 18 '22 07:11

Should be fixed by: https://github.com/JuliaLang/julia/pull/47628

aviatesk, Nov 18 '22 08:11