RuntimeGeneratedFunctions.jl
RuntimeGeneratedFunctions.jl copied to clipboard
Scary error when generating large number of RuntimeGeneratedFunctions
The MWE fills an array with simple generated functions and calls them in a loop. Works for me when n=50_000, but not when n=65535.
julia> using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)
function genfuncs(n)
fs = Any[]
for i=1:n
a = rand()
ex = :(function f(x)
return x * $a
end)
f = @RuntimeGeneratedFunction(ex)
push!(fs, f)
end
return fs
end
function calc(fs)
sum = 0.0
for i = eachindex(fs)
sum += fs[i](42.0)
end
return sum
end
fs = genfuncs(100_000)
calc(fs)
ERROR: UndefVarError: mul_float not defined
Stacktrace:
[1] * at ./float.jl:405 [inlined]
[2] macro expansion at ./REPL[1]:9 [inlined]
[3] macro expansion at /home/krisztian/.julia/packages/RuntimeGeneratedFunctions/fIcZp/src/RuntimeGeneratedFunctions.jl:80 [inlined]
[4] macro expansion at ./none:0 [inlined]
[5] generated_callfunc at ./none:0 [inlined]
[6] (::RuntimeGeneratedFunctions.RuntimeGeneratedFunction{var"#_RuntimeGeneratedFunctions_ModTag",(0x92, 0xb5, 0xf6, 0xed, 0xbb, 0x56, 0xdf, 0xe8, 0x21, 0x03, 0xee, 0xa3, 0xe5, 0x65, 0x4c, 0xf8, 0x80, 0xb3, 0xe2, 0x57, 0x59, 0x5a, 0x10, 0x91, 0xd3, 0x39, 0xc9, 0xe2, 0x7b, 0xc5, 0x09, 0xcd, 0xbf, 0xa5, 0xcd, 0x04, 0xa1, 0x99, 0xac, 0x0d, 0xf0, 0x49, 0x08, 0x53, 0xf3, 0x01, 0x0e, 0xff, 0x94, 0x1e, 0x19, 0x76, 0xd1, 0x78, 0xe1, 0x43, 0xc3, 0xa5, 0x47, 0xc1, 0xda, 0x0b, 0x8a, 0xf4),(:x,)})(::Float64) at /home/krisztian/.julia/packages/RuntimeGeneratedFunctions/fIcZp/src/RuntimeGeneratedFunctions.jl:68
[7] calc(::Array{Any,1}) at ./REPL[1]:20
[8] top-level scope at REPL[1]:26
julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Uhh... does Julia start to discard functions if we define too many? I'll have to ask some people. This is odd to say the least.
I spent an hour looking into this and reading the C code for method specializations in the runtime.
I wasn't able to find anything obvious yet. One interesting thing I noticed was that inspecting one of the failing functions with Cthulhu.jl seemed to work fine. But then I got different inconsistent UndefVarErrors such as
julia> fs[65534](1.0)
ERROR: UndefVarError: REPL[3] not defined
and
julia> fs[65533](1.0)
ERROR: UndefVarError: /home/chris/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl not defined
which does suggest some weird corruption in the Julia runtime.
I thought this might have something to do with the large number of method specializations of RuntimeGeneratedFunctions.generated_callfunc that we cause the compiler to generate here. After playing around a bit in the REPL I also see that the specializations list has grown quite a lot bigger than the boundary approximately around 2^16 where the failures seem to start. Not sure what to make of that yet.
julia> length(methods(RuntimeGeneratedFunctions.generated_callfunc).ms[1].specializations)
132387
This very much looks like an UInt16 overflow, since
julia> fs[2](1.0) == fs[65538](1.0)
true
julia> fs[3](1.0) == fs[65539](1.0)
true
julia> fs[4](1.0) == fs[65540](1.0)
true
65532-65537 aren't callable:
┌ Error: iteration 65532
│ exception = UndefVarError: mul_float not defined
└ @ Main REPL[14]:8
┌ Error: iteration 65534
│ exception = mul_float: types of a and b must match
└ @ Main REPL[14]:8
┌ Error: iteration 65535
│ exception = mul_float: types of a and b must match
└ @ Main REPL[14]:8
┌ Error: iteration 65536
│ exception = UndefVarError: /home/pfitzseb/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl not defined
└ @ Main REPL[14]:8
┌ Error: iteration 65537
│ exception = UndefVarError: REPL[3] not defined
└ @ Main REPL[14]:8
Jeff suggested running a debug build to check whether this code trips any assertions, and indeed, it does:
julia> calc(fs)
10000
20000
30000
40000
50000
60000
julia-debug: /home/pfitzseb/Documents/Git/julia/src/ircode.c:344: jl_encode_value_: Assertion `id <= UINT16_MAX' failed.
https://github.com/JuliaLang/julia/blob/902a5c199d590552758f7a91cc75c47ea67de5f2/src/ircode.c#L344
I spend a few minutes trying to adjust a bunch of ints to size_ts, but that didn't work out so well.
Patch
diff --git a/src/ircode.c b/src/ircode.c
index 73e99f2281..05cdb02533 100644
--- a/src/ircode.c
+++ b/src/ircode.c
@@ -42,7 +42,7 @@ static void tagged_root(rle_reference *rr, jl_ircode_state *s, int i)
static void literal_val_id(rle_reference *rr, jl_ircode_state *s, jl_value_t *v) JL_GC_DISABLED
{
jl_array_t *rs = s->method->roots;
- int i, l = jl_array_len(rs);
+ size_t i, l = jl_array_len(rs);
if (jl_is_symbol(v) || jl_is_concrete_type(v)) {
for (i = 0; i < l; i++) {
if (jl_array_ptr_ref(rs, i) == v)
@@ -341,9 +341,9 @@ static void jl_encode_value_(jl_ircode_state *s, jl_value_t *v, int as_literal)
write_uint8(s->s, id);
}
else {
- assert(id <= UINT16_MAX);
+ assert(id <= UINT32_MAX);
write_uint8(s->s, TAG_LONG_METHODROOT);
- write_uint16(s->s, id);
+ write_uint32(s->s, id);
}
return;
}
@@ -603,11 +603,11 @@ static jl_value_t *jl_decode_value(jl_ircode_state *s) JL_GC_DISABLED
key = read_uint64(s->s);
tag = read_uint8(s->s);
assert(tag == TAG_METHODROOT || tag == TAG_LONG_METHODROOT);
- return lookup_root(s->method, key, tag == TAG_METHODROOT ? read_uint8(s->s) : read_uint16(s->s));
+ return lookup_root(s->method, key, tag == TAG_METHODROOT ? read_uint8(s->s) : read_uint32(s->s));
case TAG_METHODROOT:
return lookup_root(s->method, 0, read_uint8(s->s));
case TAG_LONG_METHODROOT:
- return lookup_root(s->method, 0, read_uint16(s->s));
+ return lookup_root(s->method, 0, read_uint32(s->s));
case TAG_SVEC: JL_FALLTHROUGH; case TAG_LONG_SVEC:
return jl_decode_value_svec(s, tag);
case TAG_COMMONSYM:
diff --git a/src/julia_internal.h b/src/julia_internal.h
index 21c46129b7..2ee6ed073c 100644
--- a/src/julia_internal.h
+++ b/src/julia_internal.h
@@ -538,7 +538,7 @@ void jl_resolve_globals_in_ir(jl_array_t *stmts, jl_module_t *m, jl_svec_t *spar
JL_DLLEXPORT void jl_add_method_root(jl_method_t *m, jl_module_t *mod, jl_value_t* root);
void jl_append_method_roots(jl_method_t *m, uint64_t modid, jl_array_t* roots);
int get_root_reference(rle_reference *rr, jl_method_t *m, size_t i);
-jl_value_t *lookup_root(jl_method_t *m, uint64_t key, int index);
+jl_value_t *lookup_root(jl_method_t *m, uint64_t key, size_t index);
int nroots_with_key(jl_method_t *m, uint64_t key);
int jl_valid_type_param(jl_value_t *v);
diff --git a/src/method.c b/src/method.c
index 7325670bd7..3df10f20fe 100644
--- a/src/method.c
+++ b/src/method.c
@@ -404,7 +404,7 @@ static void jl_code_info_set_ir(jl_code_info_t *li, jl_expr_t *ir)
// Flags that need to be copied to slotflags
const uint8_t vinfo_mask = 8 | 16 | 32 | 64;
- int i;
+ size_t i;
for (i = 0; i < nslots; i++) {
jl_value_t *vi = jl_array_ptr_ref(vis, i);
jl_sym_t *name = (jl_sym_t*)jl_array_ptr_ref(vi, 0);
@@ -590,7 +590,7 @@ JL_DLLEXPORT jl_code_info_t *jl_code_for_staged(jl_method_instance_t *linfo)
// If this generated function has an opaque closure, cache it for
// correctness of method identity
- for (int i = 0; i < jl_array_len(func->code); ++i) {
+ for (size_t i = 0; i < jl_array_len(func->code); ++i) {
jl_value_t *stmt = jl_array_ptr_ref(func->code, i);
if (jl_is_expr(stmt) && ((jl_expr_t*)stmt)->head == jl_new_opaque_closure_sym) {
linfo->uninferred = jl_copy_ast((jl_value_t*)func);
@@ -1090,7 +1090,7 @@ int get_root_reference(rle_reference *rr, jl_method_t *m, size_t i)
// get a root, given its key and index relative to the key
// this is the relocatable way to get a root from m->roots
-jl_value_t *lookup_root(jl_method_t *m, uint64_t key, int index)
+jl_value_t *lookup_root(jl_method_t *m, uint64_t key, size_t index)
{
if (!m->root_blocks) {
assert(key == 0);
diff --git a/src/support/rle.h b/src/support/rle.h
index f85d9f35c4..3d47d15bb8 100644
--- a/src/support/rle.h
+++ b/src/support/rle.h
@@ -34,7 +34,7 @@ int rle_iter_increment(rle_iter_state *state, /* number of items */ size_t len,
/* indexing */
typedef struct {
uint64_t key;
- int index; // number of preceding items in the list with the same key
+ size_t index; // number of preceding items in the list with the same key
} rle_reference;
void rle_index_to_reference(rle_reference *rr, /* item index */ size_t i, uint64_t *rletable, size_t npairs, uint64_t key0);
This fails with
julia> fs = genfuncs(70_000);
julia-debug: /home/pfitzseb/Documents/Git/julia/src/julia.h:989: jl_array_ptr_ref: Assertion `i < jl_array_len(a)' failed.
signal (6): Aborted
in expression starting at REPL[5]:1
__pthread_kill_implementation at /usr/lib/libc.so.6 (unknown line)
raise at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
__assert_fail_base.cold at /usr/lib/libc.so.6 (unknown line)
__assert_fail at /usr/lib/libc.so.6 (unknown line)
jl_array_ptr_ref at /home/pfitzseb/Documents/Git/julia/src/julia.h:989
lookup_root at /home/pfitzseb/Documents/Git/julia/src/method.c:1097
jl_decode_value at /home/pfitzseb/Documents/Git/julia/src/ircode.c:610
jl_decode_value_expr at /home/pfitzseb/Documents/Git/julia/src/ircode.c:512
jl_decode_value at /home/pfitzseb/Documents/Git/julia/src/ircode.c:630
jl_decode_value_array at /home/pfitzseb/Documents/Git/julia/src/ircode.c:455
jl_decode_value at /home/pfitzseb/Documents/Git/julia/src/ircode.c:625
ijl_uncompress_ir at /home/pfitzseb/Documents/Git/julia/src/ircode.c:835
and I didn't figure out why m->roots isn't properly growing.
The more realistic and immediate solution is probably to change init to pre-create multiple generated functions based on an additional user-defined parameter.
https://github.com/JuliaLang/julia/pull/45173 fixes this for (at least) n = 2e5.