Internal Error/StackOverflowError in type inference
Reproducer can be found here. Run it as julia reproducer.jl; it sets up a temp environment and such on its own.
What I think is happening is that the definition of iterate here kills type inference when trying to iterate over the result of a recursive buildup of Flatten{T}. That definition is copied verbatim from Base.Iterators.flatten and also contains this line:
y = iterate(Base.tail(state)...)
which potentially splats a very large/deeply nested tuple. I think this is what leads to the type inference death in the end, and since this is copied from Base.Iterators.flatten, the Base version should have the same problem. It's hard to trigger there, though: recursively building up a regular flatten already degrades to Any much sooner, and the resulting loss of performance makes it impractical to ever reach the crash. That was the motivation for building Flatten{T} in the first place, as I can guarantee here that the eltype will always be the same.
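To make the nesting concrete, here is a minimal sketch that uses the stock Base.Iterators.flatten, not the custom Flatten{T} from the reproducer. The helpers nest and nlayers are made-up names for illustration only. Each layer of wrapping adds one level to the iterator's type, and the state that is passed back in through iterate should nest along with it:

```julia
# Hypothetical helper (not from the reproducer): wrap `it` in `depth`
# layers of Base.Iterators.flatten, one single-element tuple per layer.
function nest(it, depth)
    for _ in 1:depth
        it = Iterators.flatten((it,))
    end
    return it
end

shallow = nest(1:3, 1)
deep = nest(1:3, 5)

# Hypothetical helper: every layer shows up once in the printed type.
nlayers(x) = count("Flatten", string(typeof(x)))

println(typeof(shallow))
println(nlayers(deep))

# The state returned by `iterate` carries the inner iterators and their
# states, so its type is expected to grow with the depth as well.
_, st_shallow = iterate(shallow)
_, st_deep = iterate(deep)
println(typeof(st_shallow))
println(typeof(st_deep))

# Small depths still iterate without trouble.
println(collect(deep))
```

At a depth of 5 this is harmless. The point is only that the type inference has to reason about grows with every additional layer.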
I'm fairly certain this is the cause, as interrupting the "waiting" code before hitting the internal error leads to this stacktrace:

I know that splatting large things is bad, so I'll move to a queue/Channel based design soon™, but I don't think it should crash/throw that unsightly internal error here.
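For illustration, a queue-style design could look roughly like the sketch below. This is not the code from the reproducer: FlatQueue and its field names are hypothetical, and it assumes every stored sub-iterator yields values of type T. The sub-iterators live in one flat Vector, so the type of the container no longer depends on how many of them were added:

```julia
# Hypothetical type: keep all sub-iterators in one flat Vector instead of
# nesting them inside each other.
struct FlatQueue{T}
    parts::Vector{Any}  # sub-iterators, each assumed to yield values of type T
end
FlatQueue{T}() where {T} = FlatQueue{T}(Any[])

Base.push!(q::FlatQueue, it) = (push!(q.parts, it); q)
Base.eltype(::Type{FlatQueue{T}}) where {T} = T
Base.IteratorSize(::Type{<:FlatQueue}) = Base.SizeUnknown()

# State is (index of the current part, wrapped state of that part).
function Base.iterate(q::FlatQueue{T}, state=(1, nothing)) where {T}
    i, inner = state
    while i <= length(q.parts)
        part = q.parts[i]
        # `inner` wraps the sub-iterator's state in `Some`, so a state that
        # is itself `nothing` stays distinguishable from "not started yet".
        y = inner === nothing ? iterate(part) : iterate(part, something(inner))
        if y !== nothing
            return (y[1]::T, (i, Some(y[2])))
        end
        # Current part is exhausted, move on to the next one.
        i += 1
        inner = nothing
    end
    return nothing
end

q = FlatQueue{Int}()
push!(q, 1:3)
push!(q, [4, 5])
push!(q, ())
push!(q, (6,))
println(collect(q))
```

The trade-off in this sketch is dynamic dispatch on each sub-iterator (the parts are stored as Any), in exchange for a type that stays the same size however many parts are pushed.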
Initial internal error
Internal error: encountered unexpected error in runtime:
StackOverflowError()
is_derived_type at ./compiler/typelimits.jl:39 # I've also seen 65 here
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
Transition to `abstract_call`
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type at ./compiler/typelimits.jl:66
is_derived_type_from_any at ./compiler/typelimits.jl:74
type_more_complex at ./compiler/typelimits.jl:196
limit_type_size at ./compiler/typelimits.jl:21
abstract_call_method at ./compiler/abstractinterpretation.jl:454
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1341
abstract_call at ./compiler/abstractinterpretation.jl:1396
abstract_apply at ./compiler/abstractinterpretation.jl:997
abstract_call_known at ./compiler/abstractinterpretation.jl:1258
abstract_call at ./compiler/abstractinterpretation.jl:1396
abstract_call at ./compiler/abstractinterpretation.jl:1381
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1536
typeinf_local at ./compiler/abstractinterpretation.jl:1901
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2017
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:825 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1341
abstract_call at ./compiler/abstractinterpretation.jl:1396
abstract_apply at ./compiler/abstractinterpretation.jl:997
abstract_call_known at ./compiler/abstractinterpretation.jl:1258
abstract_call at ./compiler/abstractinterpretation.jl:1396
abstract_call at ./compiler/abstractinterpretation.jl:1381
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1536
typeinf_local at ./compiler/abstractinterpretation.jl:1901
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2017
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
.
.
.
# this continues for a very long time - I haven't seen it finish printing yet
julia> versioninfo()
Julia Version 1.8.0-DEV.548
Commit c5f348726c* (2021-09-16 15:09 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
Environment:
JULIA_PKG_SERVER =
JULIA_NUM_THREADS = 4
I'll build 1.7 later today and check if it breaks on there as well, but I'm fairly certain that it will.
Yep, also breaks on 1.7-rc1:
julia> versioninfo()
Julia Version 1.7.0-rc1
Commit 9eade6195e* (2021-09-12 06:45 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
Environment:
JULIA_PKG_SERVER =
JULIA_NUM_THREADS = 4
though this time in egal_types instead of is_derived_type:
Internal error: encountered unexpected error in runtime:
StackOverflowError()
egal_types at ~/julia/src/builtins.c:130
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
.
.
.
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145
egal_types at ~/julia/src/builtins.c:145 [inlined]
jl_types_egal at ~/julia/src/builtins.c:192
jl_types_equal at ~/julia/src/subtype.c:1916
== at ./operators.jl:248
jfptr_EQ.EQ._15392 at ~/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at ~/julia/src/gf.c:2245 [inlined]
jl_apply_generic at ~/julia/src/gf.c:2427
abstract_call_method at ./compiler/abstractinterpretation.jl:408
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1319
abstract_call at ./compiler/abstractinterpretation.jl:1374
abstract_apply at ./compiler/abstractinterpretation.jl:975
abstract_call_known at ./compiler/abstractinterpretation.jl:1236
abstract_call at ./compiler/abstractinterpretation.jl:1374
abstract_call at ./compiler/abstractinterpretation.jl:1359
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1514
typeinf_local at ./compiler/abstractinterpretation.jl:1879
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1993
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1319
abstract_call at ./compiler/abstractinterpretation.jl:1374
abstract_apply at ./compiler/abstractinterpretation.jl:975
abstract_call_known at ./compiler/abstractinterpretation.jl:1236
abstract_call at ./compiler/abstractinterpretation.jl:1374
abstract_call at ./compiler/abstractinterpretation.jl:1359
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1514
typeinf_local at ./compiler/abstractinterpretation.jl:1879
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1993
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
Dup of https://github.com/JuliaLang/julia/issues/38364 arguably.
Also, I want to highlight this comment: https://github.com/JuliaLang/julia/issues/38364#issuecomment-725558201
There are various issues in the system when dealing with large tuples: if type inference doesn't get you, then subtyping or codegen probably will. Of course we want to fix all of this eventually, but for now you really just have to avoid big tuples.
Kind of? The MWE by @martinholters at least throws a sensible StackOverflowError and not an internal error for me:
julia> xs = tuple(("a" for _ in 1:2000)...);
julia> foo(xs) = xs[1:20]
foo (generic function with 1 method)
julia> @code_typed foo(xs)
ERROR: StackOverflowError:
Stacktrace:
[1] _methods_by_ftype
@ ./reflection.jl:908 [inlined]
[2] #findall#246
...
[18967] _typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)
@ Core.Compiler ./compiler/typeinfer.jl:226
[18968] typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)
@ Core.Compiler ./compiler/typeinfer.jl:209
[18969] typeinf_code(interp::Core.Compiler.NativeInterpreter, method::Method, atypes::Any, sparams::Core.SimpleVector, run_optimizer::Bool)
@ Core.Compiler ./compiler/typeinfer.jl:845
[18970] code_typed_by_type(tt::Type; optimize::Bool, debuginfo::Symbol, world::UInt64, interp::Core.Compiler.NativeInterpreter)
@ Base ./reflection.jl:1213
[18971] code_typed(f::Any, types::Any; optimize::Bool, debuginfo::Symbol, world::UInt64, interp::Core.Compiler.NativeInterpreter)
@ Base ./reflection.jl:1181
[18972] code_typed(f::Any, types::Any)
@ Base ./reflection.jl:1168
though the original example by @fonsp also throws the internal error (though a different one that also ends in type inference).
I'd be happy with this getting a non-internal error and otherwise being a duplicate (though I'm not sure the cause is the same, as the errors seem to be handled differently...). The use case in the reproducer in my OP, while valid code, can be worked around on my end by not creating those large tuples internally.
I guess the difference between the two issues is that in my case, the code creating those Flatten actually runs fine and inference is only hit once the result is iterated over, while in the other issue inference immediately throws. If anything, this could indicate that maybe Base.Iterators.flatten shouldn't splat here either 🤷‍♂️
Like the comment says, the exact point where things explode might vary but in the end, it still gets you.